Conditional Diffusion with Locality-Aware Modal Alignment for Generating Diverse Protein Conformational Ensembles
Abstract
Recent advances in AI have enabled accurate prediction of a single stable protein structure based solely on its amino acid sequence. However, capturing the complete conformational landscape of a protein and its dynamic flexibility remains challenging. In this work, we developed Modal-aligned conditional Diffusion (Mac-Diff), a score-based diffusion model for generating conformational ensembles for unseen proteins. Central to Mac-Diff is an attention module that enforces a delicate, locality-aware alignment between the conditional view (protein sequence) and the target view (residue pair geometry) to compute highly contextualized features for effective structural denoising. Furthermore, Mac-Diff enforces the protein sequence condition using semantically rich sequence embeddings from protein language models such as ESM-2, which capture evolutionary, structural, and functional information. These embeddings account for protein structural heterogeneity more effectively than embeddings from structure prediction models, which may be biased toward the dominant conformation. Mac-Diff showed promising results in generating realistic and diverse protein structures. It successfully recovered the conformational distributions of fast-folding proteins, captured multiple metastable conformations that had only been observed in long MD simulation trajectories, and efficiently predicted alternative conformations for allosteric proteins. We believe that Mac-Diff offers a useful tool for improving the understanding of protein dynamics and structural variability, with broad implications for structural biology, drug discovery, and protein engineering.