Categorizing prediction modes within low-pLDDT regions of AlphaFold2 structures
This article has been reviewed by the following groups
Listed in
- Evaluated articles (Arcadia Science)
Abstract
AlphaFold2 protein structure predictions are widely available for structural biology uses. These predictions, especially for eukaryotic proteins, frequently contain extensive regions predicted below the pLDDT 70 level, the rule-of-thumb cutoff for high confidence. This work identifies major modes of behavior within low-pLDDT regions through a survey of human proteome predictions provided by the AlphaFold Protein Structure Database. The near-predictive mode resembles folded protein and can be a nearly accurate prediction. Barbed wire is extremely unproteinlike, being recognized by wide looping coils, an absence of packing contacts, and numerous signature validation outliers, and it likely represents a nonpredicted region. Pseudostructure presents an intermediate behavior with a misleading appearance of isolated and badly formed secondary structure-like elements. These prediction modes are compared with annotations of disorder from MobiDB, showing general correlation between barbed wire/pseudostructure and many measures of disorder, an association between pseudostructure and signal peptides, and an association between near-predictive and regions of conditional folding. To enable users to identify these regions within a prediction, a new Phenix tool is developed encompassing the results of this work, including prediction annotation, visual markup, and residue selection based on these prediction modes. This tool will help users develop expertise in interpreting difficult AlphaFold predictions and identify the near-predictive regions that can aid in molecular replacement when a prediction does not contain enough high-pLDDT regions.
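As a rough illustration of the pLDDT 70 rule of thumb mentioned in the abstract, the sketch below locates contiguous low-pLDDT stretches in a prediction. It is not the Phenix tool described above and does not attempt to classify near-predictive, pseudostructure, or barbed wire regions. It assumes a single-chain PDB-format file in which pLDDT is stored in the B-factor column, as in AlphaFold Protein Structure Database downloads; the function names `ca_plddt` and `low_plddt_segments` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

PLDDT_CUTOFF = 70.0  # rule-of-thumb cutoff for high confidence, per the abstract


def ca_plddt(pdb_text: str) -> List[Tuple[int, float]]:
    """Return (residue number, pLDDT) for each CA atom.

    Assumes fixed-column PDB ATOM records with pLDDT in the B-factor field
    (columns 61-66) and a single chain.
    """
    residues = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residues.append((int(line[22:26]), float(line[60:66])))
    return residues


def low_plddt_segments(
    residues: List[Tuple[int, float]], cutoff: float = PLDDT_CUTOFF
) -> List[Tuple[int, int]]:
    """Group consecutive residues with pLDDT below the cutoff into (start, end) ranges."""
    segments = []
    start = prev = None
    for resnum, plddt in residues:
        is_low = plddt < cutoff
        contiguous = prev is not None and resnum == prev + 1
        # Close the open segment when the run of low-pLDDT residues is broken.
        if start is not None and not (is_low and contiguous):
            segments.append((start, prev))
            start = None
        if is_low and start is None:
            start = resnum
        prev = resnum if is_low else None
    if start is not None:
        segments.append((start, prev))
    return segments


# Usage with hand-written values (not real data):
example = [(1, 90.0), (2, 50.0), (3, 60.0), (4, 95.0), (5, 40.0)]
print(low_plddt_segments(example))  # [(2, 3), (5, 5)]
```

Segments found this way are only candidates for closer inspection; distinguishing the prediction modes requires the packing and validation criteria the abstract describes.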
Article activity feed
-
Figure 1
I really enjoyed this paper taking a deeper look at the more disordered regions of AlphaFold structures (that tend to be ignored)! It's interesting to see that there actually is some categorizable structure in those regions and to think about what that might mean for using AlphaFold structures to learn more about proteins!
-
4.2. Sequence properties of prediction modes
Are there any residues or combos of residues that are more likely to be in the specific types of structure?
-
excluding unphysical due to its rarity.
How rare is this?
-
A well-packed and well-predicted core is surrounded by barbed wire and pseudostructure.
Any chance you could indicate the various types of structure in the full structure or even just where B-D are located?
-
This was largely an exercise in frustration – sequences that fold stably and behave well experimentally appear strongly correlated with sequences that AlphaFold2 predicts with high confidence. Finding experimental versions of near-predictive regions is rare because most residues deposited in the PDB have high-pLDDT AlphaFold counterparts.
Is this because the "near-predictive" regions are likely to be more dynamic and have a more transient structured state that's not captured as well in experimental structures?
-
Markup is green for Ramachandran outliers, red and blue for covalent geometry outliers, magenta for CaBLAM, lime green and yellow for cis and twisted peptide bonds.
It would be super helpful to have a legend on the figure for which colors are which. It's also hard to tell some of the colors apart.
-
Carbonyl oxygen bonds are frequently pointed in the same direction, rather than alternating as in beta strands. D: Ramachandran distribution for general-case residues in the Q86YZ3 fragment 6 prediction. Outliers are marked in purple. The distribution is highly unusual and clustered in the upper right of the plot, corresponding to an extended but unproteinlike conformation.
It's fine in the downloadable PDF, but in the full text, I think this bit is supposed to be at the end of the figure legend for figure 2. I also think 2D is really compelling! It makes me wonder what these plots would look like for the other structure types/behaviors that you note?
-