Accurate Identification and Mechanistic Evaluation of Pathogenic Missense Variants with Rhapsody-2

Abstract

Understanding the effects of missense mutations, or single amino acid variants (SAVs), on protein function is crucial for elucidating the molecular basis of diseases and disorders and for designing rational therapies. We introduce Rhapsody-2, a machine learning tool for discriminating pathogenic from neutral SAVs, which significantly expands on a precursor that was limited by the availability of structural data. With AlphaFold2 now available as a powerful tool for structure prediction, Rhapsody-2 is trained on a significantly expanded dataset of 117,525 SAVs corresponding to 12,094 human proteins reported in the ClinVar database. Using a broad set of descriptors in the training algorithm, including evolutionary, structural, dynamic, and energetic features, Rhapsody-2 achieved an AUROC of 0.94 in 10-fold cross-validation when variants of the same protein were not simultaneously included in the training and testing sets. Benchmarking against a variety of testing datasets demonstrated the high performance of Rhapsody-2. While evolutionary descriptors play a dominant role in pathogenicity prediction, structural dynamics features provide a mechanistic interpretation of the predicted effects of SAVs, whether pathogenic or neutral. Notably, residues involved in allosteric communication, and those distinguished by pronounced fluctuations in the high-frequency modes of motion or subject to spatial constraints in the soft modes, usually give rise to pathogenicity when mutated. Overall, Rhapsody-2 provides an efficient and transparent tool for accurately predicting the pathogenicity of SAVs and unraveling the mechanistic basis of the observed behavior, thus advancing our understanding of genotype-to-phenotype relations.
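The evaluation protocol mentioned in the abstract, cross-validation in which variants of the same protein never appear in both the training and testing sets, can be illustrated with a small sketch. This is not the Rhapsody-2 code: the function names `protein_grouped_folds` and `auroc`, the greedy fold-balancing strategy, and the toy inputs are all assumptions made for illustration, and the abstract does not say how folds were actually constructed.

```python
from collections import defaultdict


def protein_grouped_folds(protein_ids, n_folds=10):
    """Split variant indices into folds so each protein lives in exactly one fold.

    Hypothetical helper: protein_ids[i] is the protein that variant i belongs to.
    """
    by_protein = defaultdict(list)
    for idx, pid in enumerate(protein_ids):
        by_protein[pid].append(idx)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing (an assumption): place proteins with the most variants
    # first, always into the fold that currently holds the fewest variants.
    for pid in sorted(by_protein, key=lambda p: (-len(by_protein[p]), p)):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_protein[pid])
    return folds


def auroc(labels, scores):
    """Rank-based AUROC: probability that a pathogenic variant (label 1)
    scores higher than a neutral one (label 0); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one variant of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: seven variants from four made-up proteins.
    proteins = ["P1", "P1", "P2", "P3", "P3", "P3", "P4"]
    for k, test_idx in enumerate(protein_grouped_folds(proteins, n_folds=3)):
        held_out = sorted({proteins[i] for i in test_idx})
        print(f"fold {k}: test variants {test_idx}, held-out proteins {held_out}")
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In a full evaluation, each fold would serve once as the test set while a classifier is trained on the remaining folds, and the per-fold AUROC values would be averaged. Grouping by protein matters because variants of the same protein share evolutionary and structural context, so a random split could let the model score well by recognizing the protein rather than the variant.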
