Learning sequence to predict gain- or loss-of-function variants

Abstract

A clear understanding of mutational effects can advance genetics and biomedical research by providing valuable insights into gene functions, disease mechanisms, and therapeutic approaches. However, methods that determine the pathogenicity of genetic variants are limited because they provide no information on the direction of mutational effects. Here, we present ClearVariant, a deep learning system that classifies pathogenic variants as gain- or loss-of-function, achieving state-of-the-art performance validated with data from ClinVar and the Human Gene Mutation Database (HGMD). The model incorporates protein language models (PLMs) trained on mutated sequences alongside their reference counterparts, and it gives similar predictions when a residue is changed to another amino acid in the same property group. We assessed its ability to learn the protein language by examining its attention scores, which were high for coevolutionary relationships. To support advancements in biomedicine, we provide a database of pathogenic human missense variants labelled with their predicted mutational effects.
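
To make the "same property group" idea concrete, here is a minimal Python sketch of how a reference sequence and its mutated counterpart could be paired, and how a substitution could be checked for staying within one property group. The five-way grouping and the helper names (`PROPERTY_GROUPS`, `apply_missense`, `is_conservative`) are illustrative assumptions; the abstract does not say which grouping ClearVariant uses, and this is not the authors' code.

```python
# Assumed partition of the 20 standard amino acids by side-chain property.
# The abstract does not specify its groups; this is one common textbook split.
PROPERTY_GROUPS = {
    "nonpolar": set("GAVLIMP"),
    "aromatic": set("FWY"),
    "polar": set("STCNQ"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def property_group(residue: str) -> str:
    """Return the name of the assumed property group for a one-letter residue."""
    for name, members in PROPERTY_GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not a standard amino acid: {residue!r}")


def is_conservative(ref: str, alt: str) -> bool:
    """True if the substitution stays within one assumed property group."""
    return property_group(ref) == property_group(alt)


def apply_missense(sequence: str, position: int, alt: str) -> str:
    """Return the mutated sequence for a missense change at a 1-based position."""
    if not 1 <= position <= len(sequence):
        raise IndexError("position is outside the sequence")
    return sequence[: position - 1] + alt + sequence[position:]


# Hypothetical toy sequence: pair the reference with its mutated counterpart.
reference = "MKTLLD"
mutated = apply_missense(reference, 4, "I")  # L4I, both residues nonpolar
pair = (reference, mutated)
```

Under this assumed grouping, a change such as leucine to isoleucine stays within a group, whereas aspartate to lysine crosses from the negative to the positive group.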
