Elucidating Protein Dynamics through the Optimal Annealing of Variational Autoencoders

Abstract

Proteins traverse intricate conformational landscapes, with transitions and long-lived states that hold the key to their biological function. Yet unraveling these dynamics remains a formidable challenge. An emerging approach is to train deep variational autoencoders (VAEs) on the conformational ensemble in order to learn the underlying reduced-dimensional representation. However, VAEs are typically trained with β fixed at 1, where β is the weighting factor between the reconstruction and regularization terms. This static setup can often lead to posterior collapse, which significantly hinders the model's ability to capture complex protein dynamics accurately. Annealing the β parameter offers a potential alternative, but it frequently falls short of fully addressing the problem, largely because the upper bound of β and the annealing schedule are chosen arbitrarily. In this work, we introduce an approach for selecting the β parameter that uses the Fraction of Variance Explained (FVE) score to identify its optimal value. We demonstrate that annealed VAEs trained at their optimum β in a single cycle consistently outperform their non-annealed counterparts, as is evident from their higher Variational Approach for Markov Processes-2 (VAMP-2) and Generalized Matrix Rayleigh Quotient (GMRQ) scores and from distinct free energy surface minima, on both folded and intrinsically disordered proteins. The improved latent space representations significantly improve state space discretization, thereby refining Markov State Models and providing more accurate insights into conformational landscapes, as reflected in distinct contact maps. These findings not only underscore the potential of annealed VAEs in resolving complex conformational spaces but also highlight the critical interplay between annealing schedules and latent space structures.
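
To make the quantities named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear single-cycle ramp of β from 0 to an upper bound `beta_max`, the usual β-weighted VAE objective (reconstruction term plus β times the KL regularization term), and the common definition of FVE as one minus the ratio of residual to total variance. The function names, the `ramp_fraction` parameter, and the linear shape of the ramp are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np


def beta_schedule(step, total_steps, beta_max, ramp_fraction=0.5):
    """Single-cycle linear annealing (assumed shape): ramp beta from 0 to
    beta_max over the first ramp_fraction of training, then hold it."""
    ramp_steps = max(1, int(total_steps * ramp_fraction))
    return beta_max * min(1.0, step / ramp_steps)


def vae_loss(recon_error, kl_divergence, beta):
    """Beta-weighted VAE objective: reconstruction + beta * regularization."""
    return recon_error + beta * kl_divergence


def fraction_of_variance_explained(x, x_hat):
    """FVE = 1 - (residual sum of squares) / (total sum of squares).

    x and x_hat are arrays of shape (n_frames, n_features); the total sum
    of squares is taken about the per-feature mean of x.
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    residual = np.sum((x - x_hat) ** 2)
    total = np.sum((x - x.mean(axis=0)) ** 2)
    return 1.0 - residual / total
```

Under these assumptions, one way to use the pieces would be to train one model per candidate `beta_max`, compute `fraction_of_variance_explained` on held-out frames for each, and compare the scores; how the optimum is actually selected from the FVE values is described in the full article, not here.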
