Deep generative modeling of temperature-dependent structural ensembles of proteins
Abstract
Deep learning has revolutionized protein structure prediction, but capturing conformational ensembles and structural variability remains an open challenge. While molecular dynamics (MD) is the foundational method for simulating biomolecular dynamics, it is computationally expensive. Recently, deep learning models trained on MD data have made progress in generating structural ensembles at reduced cost. However, they remain limited in modeling atomistic details and, crucially, in incorporating the effect of environmental factors. Here, we present aSAM (atomistic structural autoencoder model), a latent diffusion model trained on MD to generate heavy-atom protein ensembles. Unlike most methods, aSAM models atoms in a latent space, greatly facilitating accurate sampling of side-chain and backbone torsion angle distributions. Additionally, we extended aSAM into the first reported transferable generator conditioned on temperature, named aSAMt. Trained on the large and open mdCATH dataset, aSAMt captures temperature-dependent ensemble properties and demonstrates generalization beyond training temperatures. By comparing aSAMt ensembles to long MD simulations of fast-folding proteins, we find that high-temperature training enhances the ability of deep generators to explore energy landscapes. Finally, we show that our MD-based aSAMt can already capture experimentally observed thermal behavior of proteins. Our work is a step towards generalizable ensemble generation to complement physics-based approaches.