Non-standard proteins in the lens of AlphaFold 3 - a case study of amyloids
Abstract
While three-dimensional structures of globular and transmembrane forms are available for many amyloid proteins, structures of their amyloid forms are scarce in the PDB. Amyloids pose major challenges for both experimental structure determination and computational modelling.
We evaluated the amyloid-modelling performance of the current top modelling software, AlphaFold 3 (AF3), using three datasets. Dataset 1 contains 153 proteins and peptides that are known to form fibrils but whose 3D structures have not been experimentally determined. Dataset 2 contains 56 non-aggregating/non-amyloid peptides. Dataset 3 contains seven proteins for which the three-dimensional fibrillar structure is known.
Fibrillar structures were predicted for 34% of dataset 1 but, unfortunately, also for 54% of dataset 2. Fibrillar structures were successfully predicted for five of the seven proteins in dataset 3. Compared with other methods, AF3 outperformed Boltz and predicted the structures of CsgA and α-synuclein more accurately than RibbonFold, whereas RibbonFold predicted Aβ-42 better.
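One way to read the dataset 1 and dataset 2 percentages together is to treat "AF3 predicts a fibrillar structure" as a yes/no amyloid call. The sketch below is an illustration under that assumption only; the function name `detection_summary` and the choice of summary metrics are hypothetical and are not part of the study, and the inputs are the rounded percentages quoted above.

```python
def detection_summary(hit_rate_amyloid: float, hit_rate_non_amyloid: float) -> dict:
    """Summarise fibril predictions as if they were a binary amyloid detector.

    hit_rate_amyloid:     fraction of known amyloids given a fibrillar model
    hit_rate_non_amyloid: fraction of non-amyloids given a fibrillar model
    """
    sensitivity = hit_rate_amyloid              # true-positive rate
    specificity = 1.0 - hit_rate_non_amyloid    # true-negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "youden_j": sensitivity + specificity - 1,
    }


# Rounded rates from the abstract: 34% of dataset 1, 54% of dataset 2.
summary = detection_summary(0.34, 0.54)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under this reading, a fibrillar prediction is more frequent for the non-amyloid peptides than for the known amyloids, so the mere presence of a fibrillar AF3 model would not serve as evidence of amyloidogenicity on these datasets.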
The performance of AF3 in predicting amyloid structures for our datasets seems to be hindered by the low abundance of amyloid structures in the PDB and the high prevalence of structural data for their non-fibrillar forms. AF3 tends to assign a higher quality score to globular oligomeric models than to fibrillar ones. A correct amyloid structure prediction is more likely to be obtained for shorter fragments. The amyloid-modelling quality of AF3 seems underwhelming, but it can still provide hypotheses about amyloid structures in some cases. Our work also suggests the steps needed to achieve better performance in the near future.
Statement for a broader audience
Amyloid proteins can form stable, insoluble fibrils that are often associated with neurodegenerative diseases. Knowledge of the three-dimensional structure of these fibrils is important, e.g. for drug design. We evaluate the performance of AlphaFold 3 on the prediction of amyloid structures and observe that it struggles with these cases. The problems seem to arise mainly from the nature of the AlphaFold 3 training dataset and the polymorphic nature of many amyloids. Although the results are underwhelming, AlphaFold 3 can sometimes provide valuable insights into amyloid protein structures, something that only a few years ago still seemed a very hard-to-reach goal.