Extending Conformational Ensemble Prediction to Multidomain Proteins and Protein Complex

Abstract

Proteins execute cellular functions through a structural continuum ranging from stable, folded domains to highly dynamic intrinsically disordered regions (IDRs). A conformational ensemble is the set of three-dimensional structures a protein adopts under a specific set of conditions, and such ensembles underlie essential processes from catalysis to complex signaling networks. While deep learning has revolutionized structure prediction, capturing the distinct conformational diversity of folded and disordered regions, especially within multidomain proteins and large assemblies, remains a fundamental challenge. Here we introduce IDPFold2, a generative framework that models heterogeneous protein thermodynamics by integrating a Mixture-of-Experts architecture into the flow matching framework. By routing residues from different regions to specialized expert networks, IDPFold2 accurately predicts conformational ensembles for folded domains, IDRs, and multidomain proteins. IDPFold2 outperforms state-of-the-art methods in capturing key functional states and in fitting experimental observations across local and global scales. Furthermore, we describe an extension of IDPFold2 to protein assemblies that deciphers the complex binding modes of IDRs within large macromolecular complexes, providing a generalizable tool for exploring the dynamic proteome.
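
To make the routing idea concrete, the following is a minimal sketch of per-residue Mixture-of-Experts gating. It is not the IDPFold2 implementation: the abstract does not say whether routing is hard or soft, how many experts exist, or what the experts compute. The soft (softmax) gate, the linear experts, the three-component output standing in for a per-residue velocity, and every name below (`moe_output`, `gate_weights`, `expert_weights`) are assumptions chosen purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def moe_output(residue_features, gate_weights, expert_weights):
    """Mix the outputs of several experts for one residue.

    residue_features: d floats describing the residue (hypothetical features).
    gate_weights:     one row of d floats per expert; gives routing scores.
    expert_weights:   one 3 x d matrix per expert (toy linear experts).
    Returns the gate-weighted 3-vector and the gate weights themselves.
    """
    gates = softmax(matvec(gate_weights, residue_features))
    outputs = [matvec(w, residue_features) for w in expert_weights]
    mixed = [sum(g * out[k] for g, out in zip(gates, outputs)) for k in range(3)]
    return mixed, gates


# Toy example: a residue whose features favor expert 0 (say, a "folded" expert).
features = [1.0, 0.0]
gate_weights = [[2.0, 0.0], [0.0, 2.0]]
expert_weights = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],    # expert 0
    [[-1.0, 0.0], [0.0, -1.0], [0.0, 0.0]],  # expert 1
]
velocity, gates = moe_output(features, gate_weights, expert_weights)
```

In this toy setup the gate assigns most of the weight to expert 0, so the mixed output is dominated by that expert. In a flow matching model, a vector of this kind would presumably play the role of the learned velocity that is integrated over time to generate a structure, but that connection is an assumption about the general technique rather than a detail stated in the abstract.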
