Machine Learning Driven Simulations of SARS-CoV-2 Fitness Landscape
Abstract
SARS-CoV-2 infection is mediated by interactions between the receptor binding domain (RBD) of the viral spike protein and host cell angiotensin converting enzyme 2 (ACE2) receptors. Mutations in the spike protein are the primary cause of neutralizing antibody escape, which leads to breakthrough infections. We characterize the fitness landscape underpinning future variants of concern by combining supervised machine learning with Markov chain Monte Carlo (MCMC). Leveraging deep mutational scanning (DMS) data that characterize the binding affinity of RBD mutants for the ACE2 receptor, we predict variants of concern not seen in the training data and sample statistics of the fitness landscape. These simulations provide insight into the relationship between RBD sequence elements and offer a new perspective on using DMS to predict emerging viral strains.
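The abstract does not specify the model or the sampling scheme, so the following is only a minimal sketch of the general idea of pairing a learned affinity predictor with MCMC. Everything in it is an assumption made for illustration: the additive per-site score table stands in for a regressor trained on DMS data, and the names `make_toy_model`, `predicted_affinity`, and `metropolis_sample`, the inverse temperature `beta`, and the single-site substitution proposal are hypothetical, not the authors' method.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def make_toy_model(length, rng):
    """Hypothetical stand-in for a regressor trained on DMS data:
    a table of random per-site, per-residue scores (assumption)."""
    return [{aa: rng.gauss(0.0, 1.0) for aa in AMINO_ACIDS} for _ in range(length)]

def predicted_affinity(seq, model):
    """Surrogate fitness: sum of per-site scores (toy additive assumption)."""
    return sum(site[aa] for site, aa in zip(model, seq))

def metropolis_sample(start, model, steps, beta, rng):
    """Metropolis sampling over sequences with single-site substitutions.
    Targets a distribution proportional to exp(beta * predicted_affinity)."""
    current = list(start)
    current_score = predicted_affinity(current, model)
    samples = []
    accepted = 0
    for _ in range(steps):
        # Propose a substitution at a uniformly chosen position.
        pos = rng.randrange(len(current))
        proposal = current.copy()
        proposal[pos] = rng.choice(AMINO_ACIDS)
        new_score = predicted_affinity(proposal, model)
        # The proposal is symmetric, so the acceptance ratio is exp(beta * delta).
        delta = new_score - current_score
        if delta >= 0 or rng.random() < math.exp(beta * delta):
            current, current_score = proposal, new_score
            accepted += 1
        samples.append(("".join(current), current_score))
    return samples, accepted / steps

# Usage: sample a toy 10-residue landscape and report the best sequence visited.
rng = random.Random(0)
model = make_toy_model(10, rng)
start = "".join(rng.choice(AMINO_ACIDS) for _ in range(10))
samples, acc_rate = metropolis_sample(start, model, steps=2000, beta=1.0, rng=rng)
best_seq, best_score = max(samples, key=lambda s: s[1])
print(best_seq, round(best_score, 3), round(acc_rate, 2))
```

In this sketch the visited sequences serve two purposes: the highest-scoring ones are candidate high-affinity variants, and the full chain gives samples from which landscape statistics (for example, per-site residue frequencies) could be estimated. A real application would replace the toy table with a model fitted to measured DMS affinities.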