Structure-informed direct coupling analysis improves protein mutational landscape predictions
Abstract
Direct Coupling Analysis has been instrumental over the past decade in leveraging evolutionary information and advancing our understanding of biomolecular structure and function. Here, we introduce sparse extensions of this method that explicitly incorporate structural information. StructureDCA focuses on physically relevant interactions by selectively retaining couplings between residues in spatial contact, and StructureDCA[RSA] additionally incorporates per-residue relative solvent accessibility. These models outperform state-of-the-art approaches in describing mutational landscapes, as they more effectively integrate structural context. Moreover, their sparse formulation enables orders-of-magnitude improvements in computational efficiency while preserving interpretability, providing a powerful framework for gaining mechanistic insights into mutation effects and advancing protein design. The StructureDCA models are available as a user-friendly Python package via the PyPI repository. The source code is freely accessible at https://github.com/3BioCompBio/StructureDCA, which also includes a Colab Notebook interface.
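To make the idea of retaining only couplings between residues in spatial contact more concrete, here is a minimal toy sketch of a contact-restricted Potts-style model. It is not the authors' implementation and does not use the StructureDCA package API. The function names, the 8 Å distance cutoff, the three-state alphabet, the random parameters, and the scoring of a mutation as an energy difference under the standard Potts sign convention are all illustrative assumptions. The RSA-weighted variant is not sketched, since the abstract does not say how solvent accessibility enters the model.

```python
import random

Q = 3  # toy alphabet size (assumption; a protein model would use 20 or 21 states)


def contact_pairs(coords, cutoff=8.0):
    """Return the set of pairs (i, j), i < j, whose coordinates lie within `cutoff`.

    The 8.0 cutoff is an illustrative assumption, not a value from the source.
    """
    pairs = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if d2 <= cutoff ** 2:
                pairs.add((i, j))
    return pairs


def potts_energy(seq, fields, couplings):
    """E(seq) = -sum_i h_i(a_i) - sum_{(i,j) stored} J_ij(a_i, a_j).

    `couplings` holds matrices only for contact pairs, so the pair sum is sparse.
    """
    energy = -sum(fields[i][a] for i, a in enumerate(seq))
    for (i, j), J in couplings.items():
        energy -= J[seq[i]][seq[j]]
    return energy


def mutation_score(seq, pos, new_state, fields, couplings):
    """Energy change E(mutant) - E(wild type) for a single substitution."""
    mutant = list(seq)
    mutant[pos] = new_state
    return potts_energy(mutant, fields, couplings) - potts_energy(seq, fields, couplings)


# Toy data: four residues on a line; the last one is far from all others.
rng = random.Random(0)
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
contacts = contact_pairs(coords)

# Random stand-ins for parameters that would normally be fitted to an alignment.
fields = [[rng.uniform(-1, 1) for _ in range(Q)] for _ in coords]
couplings = {
    pair: [[rng.uniform(-1, 1) for _ in range(Q)] for _ in range(Q)]
    for pair in sorted(contacts)
}

wild_type = [0, 1, 2, 0]
print("contact pairs:", sorted(contacts))
print("score for residue 1 -> state 2:", mutation_score(wild_type, 1, 2, fields, couplings))
```

In this toy setup the number of coupling matrices scales with the number of contacts rather than with all residue pairs, which is the kind of sparsity the abstract refers to. A substitution at the isolated fourth residue changes only its single-site term, which illustrates how a contact-restricted model keeps each score attributable to specific structural neighbours.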