Constructing the ensemble of representative structures for a protein via neural-surrogate-guided MSA recombination
Abstract
Structural dynamics are essential for understanding the function and mechanism of proteins. Previous research attempted to generate diverse protein structures using the multiple sequence alignment (MSA), but failed to provide physically relevant representative conformations in the absence of state annotations. In this work, we propose a framework named ProCEDiS that generates a compact ensemble of representative conformations for a target protein without prior knowledge. By adopting a neural surrogate to guide the exploration of MSA recombination and integrating with AlphaFold2 to model structures, the method automatically finds high-quality, mutually dissimilar conformations for the target sequence. Parallel short-timescale molecular dynamics (MD) simulations started from these structure seeds enable a quick, albeit crude, free energy estimation, from which physically plausible representative states can be identified. In a benchmark on four protein systems, the ProCEDiS + MD pipeline provides valuable structural dynamics information within an acceptable running time.
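To make the "quick while crude free energy estimation" step concrete, the following is a minimal sketch of one common approach, Boltzmann inversion of state populations, and not necessarily the estimator used in ProCEDiS. It assumes that frames from the pooled short MD trajectories have already been assigned to discrete conformational states; the function name `relative_free_energies`, the state labels, and the 300 K temperature are hypothetical choices made for illustration only.

```python
import math
from collections import Counter

# Boltzmann constant in kcal/(mol*K).
K_B = 0.0019872041


def relative_free_energies(state_labels, temperature=300.0):
    """Estimate relative free energies from state populations.

    Assumption (not taken from the abstract): each MD frame carries a
    discrete state label, and F_i = -kT * ln(p_i), where p_i is the
    fraction of frames assigned to state i. Values are shifted so the
    most populated state has a free energy of zero.
    """
    counts = Counter(state_labels)
    total = sum(counts.values())
    kT = K_B * temperature
    raw = {s: -kT * math.log(c / total) for s, c in counts.items()}
    reference = min(raw.values())
    return {s: f - reference for s, f in raw.items()}


# Hypothetical example: 80 frames in state "A", 20 frames in state "B".
labels = ["A"] * 80 + ["B"] * 20
energies = relative_free_energies(labels)
for state, value in sorted(energies.items()):
    print(f"{state}: {value:.3f} kcal/mol")
```

Because short simulations rarely reach equilibrium between states, populations obtained this way give only a rough ranking of states, which is consistent with the abstract's description of the estimate as crude.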