Scalable prediction of symmetric protein complex structures

Abstract

All life relies on proteins to function, yet accurately modeling protein structures exceeding 10,000 amino acids remains extremely difficult. Existing solutions are limited to specific scenarios, require considerable computational resources, or otherwise fail to scale. Consequently, many large, disease-relevant protein complexes in the human proteome, as well as nearly all viruses and numerous other classes, are impractical to model with high fidelity for drug development. Structural information is highly valuable, and often essential, for modulating these protein complexes and viruses. In the last two years, machine learning-based tools have emerged that can generate binders to a given target structure with high hit rates. Combined with high-throughput screening, these technologies can far outpace traditional drug discovery. However, they cannot function well without accurate models of their target structures. Thus, unlocking the full power of AI-driven drug discovery requires a scalable method for predicting large protein complex structures. To overcome this bottleneck, we introduce Cosmohedra, a physics-based method to rapidly and accurately predict the structure of arbitrarily large, symmetric protein complexes. Validated across four major symmetry classes (icosahedral, tetrahedral, octahedral, and cyclic), the method consistently achieves near-experimental levels of accuracy, i.e., RMSD < 5 Å. In test cases, the method runs in < 5 minutes on consumer hardware, 10³ to 10⁵ times faster than the closest comparable software. The largest structure currently built, at ≈40,000 amino acids, is more than 4 times the size limit of existing machine learning and molecular dynamics-based methods. By dramatically increasing the speed and scale at which protein complex structures can be modeled, Cosmohedra represents a new step towards universal protein structure prediction and a valuable tool for protein engineering and drug development.
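To make the abstract's two technical notions concrete (cyclic symmetry and the RMSD accuracy metric), here is a minimal illustrative sketch. It is not Cosmohedra's algorithm, which the abstract does not describe. The function names `cyclic_copies` and `rmsd`, the toy coordinates, and the choice of the z-axis as the symmetry axis are all assumptions made for illustration, and the RMSD here assumes the two structures are already superposed.

```python
import math

def cyclic_copies(subunit, n):
    """Return n copies of `subunit` (a list of (x, y, z) tuples),
    each rotated about the z-axis by 2*pi*k/n (C_n symmetry).
    Hypothetical helper: the symmetry axis is assumed to be z."""
    copies = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        c, s = math.cos(theta), math.sin(theta)
        copies.append([(c * x - s * y, s * x + c * y, z) for x, y, z in subunit])
    return copies

def rmsd(a, b):
    """Root-mean-square deviation between two equal-length coordinate
    lists. Assumes the structures are already superposed; no alignment
    is performed here."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

# Toy two-atom "subunit" (coordinates in Å, invented for illustration).
subunit = [(10.0, 0.0, 0.0), (11.5, 1.0, 0.5)]

# Assemble a C4 complex by flattening the four rotated copies.
complex_c4 = [atom for copy in cyclic_copies(subunit, 4) for atom in copy]

# A "prediction" displaced by 1 Å along x relative to the reference.
shifted = [(x + 1.0, y, z) for x, y, z in complex_c4]
```

In this toy setup, `rmsd(complex_c4, shifted)` measures a uniform 1 Å displacement. A real comparison against an experimental structure would first superpose the two models (for example with the Kabsch algorithm) before computing RMSD.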
