The KFC Server
The KFC Server is a web-based implementation of the KFC model, a machine learning approach for predicting binding hot spots: the subset of residues that account for most of a protein interface’s binding free energy. The server automates the analysis of a given protein interface and the visualization of its hot spot predictions. For each residue within the interface, the KFC Server characterizes the local structural environment surrounding the residue, compares it to known environments of experimentally determined hot spots, and predicts whether the residue is a hot spot. Once the computational analysis is complete, the user can visualize the results in an interactive job viewer. In addition to standard molecular viewer functionality, the job viewer lets the user quickly highlight predicted hot spots and surrounding structural features through a graphical interface.
The KFC model comprises two decision tree-based models: K-FADE (based on shape specificity features) and K-CON (based on biochemical contacts such as intermolecular hydrogen bonds and atomic contacts). Each model is trained on a data set of experimental alanine-scanning mutations to recognize local structural environments that are indicative of hot spots. For this work, a hot spot is defined as a residue whose mutation to alanine is associated with a change in binding energy (∆∆G) greater than 2 kcal/mol. The first article listed under Publications provides a complete discussion of the development and performance of the KFC model.
Publications
Please cite the following in any work that uses the KFC Server:
- S. J. Darnell, D. Page, and J. C. Mitchell. Automated Decision-Tree Approach to Predicting Protein-Protein Interaction Hot Spots. Proteins, 68(4): 813-823, 2007.
- S. J. Darnell, L. Legault, and J. C. Mitchell. KFC Server: Interactive Forecasting of Protein Interaction Hot Spots. Nucleic Acids Res, in press, 2008.
Running a KFC Analysis
Registration and Login
Users can register prior to submitting jobs to any of the tools hosted by the Mitchell Lab or submit jobs anonymously. Personal information is only used to contact users when their analysis is complete; it will not be shared. To register, enter a unique user name and email address on the registration page, then click the submit button. An error message will display if the selected user name is in use by another user.
Once registered, users may log in to the server. Although login is not required to submit jobs, it allows users to view their personal jobs in the job viewer. Both the user name and password are case sensitive. By default, a login expires after two weeks; however, a user may also log off manually.
Submitting a Job
Before the KFC analysis can begin, a user must provide the structure of a protein complex and define the interface to analyze. A protein interface is the region between the two binding partners, where each partner is composed of one or more proteins. Files that do not contain a bound complex are unlikely to yield useful results. In addition, model structures containing many clashes may lead KFC to vastly overestimate the number of hot spots. Finally, KFC can analyze structures containing proteins and DNA/RNA but not other types of molecules. Please remove any other molecules from your PDB file before submitting it to the server.
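As a rough illustration of the clean-up step described above, the following Python sketch drops HETATM records and any ATOM record whose residue name is not a standard amino acid or nucleotide. The function name and the residue lists are assumptions made for this example, not the server's own definitions; it relies only on the fixed-column PDB layout (record name in columns 1-6, residue name in columns 18-20).

```python
# Residue names treated as "standard" here. This list is an assumption for
# illustration, not the list used internally by the KFC Server.
AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
NUCLEOTIDES = {"A", "C", "G", "U", "DA", "DC", "DG", "DT"}
KEEP = AMINO_ACIDS | NUCLEOTIDES


def strip_other_molecules(pdb_lines):
    """Return the lines with HETATM records and non-standard ATOM residues removed."""
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "HETATM":
            continue  # ligands, waters, ions
        if record == "ATOM" and line[17:20].strip() not in KEEP:
            continue  # e.g. a ligand mislabeled as an ATOM record
        kept.append(line)  # everything else passes through unchanged
    return kept
```

A file could be cleaned by reading it with `open(path).read().splitlines()`, passing the lines through `strip_other_molecules`, and writing the result back out. Records that refer to removed atoms (such as CONECT or ANISOU) are not handled by this sketch.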
To analyze an interface, enter the following information on the submission page and click the Submit button:
- Username — Registered user name (or guest)
- Email Address — Email address associated with the user name (optional for guest login)
- Upload Complex or PDB Code — Link to PDB file of a macromolecular complex, or a PDB identification code (e.g. 1XYZ)
- Protein 1 Chainlist — List of chains defining the first binding partner (e.g. ABC)
- Protein 2 Chainlist — List of chains defining the second binding partner (e.g. DEF)
- Consurf Scores — Optional file containing ConSurf conservation scores
- Rosetta AlaScan — Optional file containing Rosetta Alanine Scanning results
- Experimental Data — Optional file containing experimentally known hot spot data
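Because the two chain lists must match chain identifiers actually present in the structure, it can help to check them locally before submitting. The sketch below is one possible check, assuming each chain list is a string of single-character chain IDs (as in the examples `ABC` and `DEF`) and reading the chain ID from column 22 of each ATOM record. The function names are hypothetical, and treating a chain shared by both partners as a problem is an assumption of this example rather than a documented server rule.

```python
def chains_in_pdb(pdb_lines):
    """Return the set of chain IDs found in ATOM records (PDB column 22)."""
    return {
        line[21]
        for line in pdb_lines
        if line.startswith("ATOM") and len(line) > 21
    }


def check_chainlists(pdb_lines, chainlist_1, chainlist_2):
    """Return a list of problems with the chain lists; an empty list means none found."""
    present = chains_in_pdb(pdb_lines)
    problems = []
    for label, chainlist in (("Protein 1", chainlist_1), ("Protein 2", chainlist_2)):
        if not chainlist:
            problems.append(f"{label} chain list is empty")
        missing = [c for c in chainlist if c not in present]
        if missing:
            problems.append(f"{label} chains not in file: {''.join(missing)}")
    # Assumption: a chain should belong to only one binding partner.
    shared = set(chainlist_1) & set(chainlist_2)
    if shared:
        problems.append(f"chains listed for both partners: {''.join(sorted(shared))}")
    return problems
```

An empty result from `check_chainlists(lines, "ABC", "DEF")` means every listed chain was found in the file and no chain appears in both lists.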
Job Queue and Error Messages
Upon submission, the task will enter the job queue and wait for processing. The queue displays the current status for each submitted job (Queued, Active, View Results, or Error), and provides links to KFC input and output files. After processing begins, a typical KFC analysis finishes within two minutes. When the task is complete, an email is sent to the user with their KFC hot spot predictions or an error message. If the job finishes successfully, the status field will contain a link to the interactive job viewer. Jobs that end in error are described by the following error codes:
- Error 2: Error occurred while calculating shape specificity — This most likely occurs because the chain IDs you provided are incorrect and do not define a valid interface. Check the error file to see if it says “No atoms found.”
- Error 3: Error occurred while calculating biochemical contacts — This most likely occurs because there are atoms or groups that KFC does not recognize. Please make sure your PDB file contains only standard amino acids or nucleotides with atoms C, N, O, S, H. The error can also occur if the atom type is shifted into the wrong column. Check the error file for a complaint about an unrecognized item, and delete that item from the PDB file.
- Error 4: Error occurred while predicting hot spots — This is unlikely to occur.
- Error 5: Error occurred while deleting temporary files — This is unlikely to occur.
Users can access KFC input, output, and error files by clicking on a job’s identification number. Most errors are caused by non-standard amino acids or ligands incorrectly labeled as ATOM records within the PDB coordinate file. If possible, the user should resolve the inconsistencies in the file and submit a new job. If subsequent jobs still end in error, users can contact admin@mitchell-lab.org for assistance.
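One way to look for the kind of problem described under Error 3 before resubmitting is to scan the ATOM records for unexpected element symbols. The following is a minimal sketch, assuming the element symbol sits in columns 77-78 as in the standard PDB layout. The function name is hypothetical, and the default set of allowed elements is simply copied from the Error 3 description; pass a different set if your structure legitimately contains other elements.

```python
ALLOWED_ELEMENTS = frozenset({"C", "N", "O", "S", "H"})  # as listed under Error 3


def find_suspect_atoms(pdb_lines, allowed=ALLOWED_ELEMENTS):
    """Return (line_number, reason) pairs for ATOM records worth inspecting."""
    suspects = []
    for number, line in enumerate(pdb_lines, start=1):
        if not line.startswith("ATOM"):
            continue
        element = line[76:78].strip().upper()  # columns 77-78
        if not element:
            # A blank field may mean a short line or shifted columns.
            suspects.append((number, "no element symbol in columns 77-78"))
        elif element not in allowed:
            suspects.append((number, f"unexpected element {element}"))
    return suspects
```

Each reported line number points at a record to inspect by hand. A blank element field is not necessarily an error, since some files omit that column, but it is a common sign of misaligned columns.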
Format of Hot Spot Predictions
- Chain — Chain identifier from PDB file
- Res — Amino acid residue name
- Num — Residue number from PDB file
- K-FADE Class — Predicted K-FADE classification
- K-FADE Conf — Confidence of K-FADE prediction (0 is worst, 1 is best)
- K-CON Class — Predicted K-CON classification
- K-CON Conf — Confidence of K-CON prediction (0 is worst, 1 is best)
- Consurf Class — ConSurf conservation class (highly conserved or not)
- ConSu Value — ConSurf conservation score (1-9)
- Rosetta Class — Predicted Rosetta classification (Hotspot if ∆∆G>2kcal/mol)
- Roset DDG — Predicted Rosetta ∆∆G value
- Exper Class — User-defined experimental hotspot class
- Exper Value — User-defined experimental data value
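The hot spot criterion used for the Rosetta Class column (∆∆G greater than 2 kcal/mol) can be written as a short Python sketch. The function names and the dictionary layout keyed by (chain, residue number) are assumptions for illustration; they do not describe the format of the server's output files.

```python
HOT_SPOT_DDG = 2.0  # kcal/mol, the threshold stated in this manual


def is_hot_spot(ddg, threshold=HOT_SPOT_DDG):
    """Return True when a ddG value exceeds the hot spot threshold."""
    return ddg > threshold


def hot_spots(ddg_by_residue, threshold=HOT_SPOT_DDG):
    """Return the residue keys whose ddG exceeds the threshold, in input order."""
    return [res for res, ddg in ddg_by_residue.items() if is_hot_spot(ddg, threshold)]
```

Note that the comparison is strict, so a residue with a ∆∆G of exactly 2 kcal/mol is not labeled a hot spot under this reading of the definition.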
Using the KFC Viewer
The job viewer has two major components: a molecular viewer on the left, and a control panel on the right. Users can directly interact with the molecular viewer or use the control panel to affect the display. Each component is described in more detail below.
Control Panel: FADE Shape Markers
KFC uses the Fast Atomic Density Evaluator (FADE) to analyze the shape specificity within a protein-protein interface. Users can highlight different degrees of shape specificity by clicking on the color-coded checkboxes.
- Matched — Red, Orange
- Neutral — Yellow, Green
- Mismatched — Violet, Blue
Control Panel: Display Controls
These controls alter the appearance of the selected atoms. By default, KFC selects all protein atoms in the complex. Advanced users may change the atom selection by using the Jmol scripting language.
- Background — Change the color of the background
- Style — Change the representation of selected backbone or side-chain atoms
- Color — Change the color of the selected backbone or side-chain atoms
- Surface — Add a molecular surface to the selected atoms
- Show Selection — Highlight the current atom selection.
Additionally, users can save up to four different views of their session.
- Save — Record the current state of the display
- View — Restore the viewer to the saved state
Control Panel: Interface and Hot Spots
Each chain produces a unique group in the interface display.
- Show Chain — The checkbox by the chain name toggles whether the chain is displayed or hidden.
- Selection — The popup menu determines which subset of atoms is selected for action by the Display Controls (See Caveat #1).
The three checkboxes in each cell control the display of an interface residue.
- Checkbox #1 — Highlight the residue with space filling
- Checkbox #2 — Show the residue using sticks
- Checkbox #3 — Add a translucent surface around the residue
The coloring within each cell also encodes information about the residue.
- Background color — Chemical type (gray = hydrophobic, yellow = polar, red = acidic, blue = basic, purple = HIS or nucleic)
- Highlight color — Classification (pink = predicted hot spot, white = interface residue)
- Popup values — Hold the mouse over a name to see KFC and other scores for that sidechain
Control Panel: Miscellaneous Buttons
- Open Console — Opens the console, which accepts Rasmol commands (See Caveat #1)
- PDB File — Opens the PDB file used by the molecular viewer
- Jmol Help — Opens the documentation for Jmol
- KFC Help — Opens the KFC Server instruction manual (this document)
Molecular Viewer: Jmol
Jmol is the molecular viewer used throughout the Mitchell Lab website. It is an applet written in Java, so users must enable Java and JavaScript in their web browsers to use the KFC Server. Windows users may also need to install the most current Sun Java Runtime Environment (JRE) to use Jmol. Jmol is extensively documented, so we direct users to the following websites for information about its use.
Caveats
- If you use the console to make selections and change displays, the selections shown in the Control Panel may no longer be accurate. Actions taken using the console override any mouse-driven selection and display controls.
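For users who do work in the console, selections are typed as Rasmol/Jmol expressions such as `select 45:A` (residue 45 of chain A), followed by a display command. The Python sketch below only assembles such a command string for pasting into the console. The function name and the two style keys are hypothetical, and this manual does not state whether the Control Panel checkboxes issue these same commands.

```python
def residue_commands(chain, number, style="spacefill"):
    """Build a console command string that selects one residue and restyles it."""
    styles = {
        "spacefill": "spacefill on",  # space-filling spheres
        "sticks": "wireframe 0.3",    # bonds drawn as 0.3 angstrom sticks
    }
    if style not in styles:
        raise ValueError(f"unknown style: {style}")
    return f"select {number}:{chain}; {styles[style]};"
```

For example, `residue_commands("A", 45)` produces `select 45:A; spacefill on;`. As the caveat above notes, once commands like this are run in the console, the Control Panel may no longer reflect the current selection.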