AlphaFlex: Ensembles of the human proteome representing disordered regions
Abstract
Over a third of residues in the canonical human proteome are predicted to fall within intrinsically disordered protein regions (IDRs), which do not adopt stable folded structures. These IDRs play critical roles in biological regulation and organization, serving as targets for post-translational modifications, as scaffolds, and as mediators of biomolecular condensates. To address the pressing need for valid structural models that are biologically relevant and enable functional insight, we developed the AlphaFlex workflow, which uses IDPConformerGenerator or IDPForge to calculate fully atomistic conformer ensembles for proteins predicted to have disordered regions, modeled in the context of highly confident folded domains from AlphaFold2. We demonstrate the approach by generating conformational ensembles of the human proteins in the AlphaFold2 database, with completed AlphaFlex models deposited in the Protein Ensemble Database, which is mirrored in UniProt. This resource of AlphaFlex ensembles provides more realistic and biologically relevant full-length models for proteins with IDRs, as we illustrate for scaffold proteins with folded domains connected by IDRs, proteins with IDRs that interact with folded domains, regulatory and condensate proteins requiring exposed binding elements, and a conditionally folding IDR.