Conserved missense variant pathogenicity and correlated phenotypes across paralogous genes

Abstract

Background The majority of missense variants in clinical genetic tests are classified as variants of uncertain significance. Prior research has shown that the deleterious effects and the subsequent molecular consequences of variants are often conserved among paralogous protein sequences within a gene family. Here, we systematically quantified on an exome-wide scale whether the existence of pathogenic variants in paralogous genes at a conserved position could serve as evidence for the pathogenicity of a new variant. For the gene family of voltage-gated sodium channels, where variants and expert-curated clinical phenotypes were available, we also assessed whether phenotype patterns of multiple disorders for each gene were conserved across variant positions within the gene family.

Methods We developed a framework that assesses the presence of pathogenic missense variants located in conserved residues across paralogous genes. We systematically mapped 2.5 million pathogenic and general population variants from the ClinVar, HGMD, and gnomAD databases onto a total of 9,990 genes and aligned them by gene family. We evaluated the number of classifiable amino acids using pathogenic variants identified in the databases alone and then compared this with the number obtained when paralogous pathogenic variants were included. We validated and quantified the evidence that conserved pathogenic paralogous variants provide for variant pathogenicity classification.

Results Considering conserved pathogenic variants in paralogous genes increased the number of classifiable variants 2.8-fold across the exome, compared with considering pathogenic variants in the gene of interest alone. The presence of a pathogenic variant in a paralogous gene is associated with a positive likelihood ratio of 8.32 for variant pathogenicity. The likelihood ratio was gene family-specific.
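As a reading aid for the positive likelihood ratio reported above, the following minimal Python sketch shows how such a ratio is computed from a two-by-two table. It assumes the "test" is the presence of a pathogenic variant at the aligned position in a paralog, and the counts are hypothetical illustrations, not data from the study; the function name is likewise an assumption.

```python
def positive_likelihood_ratio(tp, fn, fp, tn):
    """Return LR+ = sensitivity / (1 - specificity).

    tp: pathogenic variants WITH a pathogenic paralogous variant at the aligned position
    fn: pathogenic variants WITHOUT one
    fp: benign/population variants WITH one
    tn: benign/population variants WITHOUT one
    """
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    if false_positive_rate == 0:
        raise ValueError("LR+ is undefined when there are no false positives")
    return sensitivity / false_positive_rate


# Hypothetical counts chosen only for illustration (NOT from the study).
lr_plus = positive_likelihood_ratio(tp=40, fn=60, fp=5, tn=95)
print(round(lr_plus, 2))
```

A value well above 1 means that the paralog evidence is observed much more often for pathogenic variants than for benign ones.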
Across ten genes encoding voltage-gated sodium channels and 22 expert-curated disorders, we identified cross-paralog correlated phenotypes based on spatial position in the 3D structure. For example, the established loss-of-function disorders SCN1A-associated Dravet syndrome, SCN2A-associated autism, SCN5A-associated Brugada syndrome, and SCN8A-associated neurodevelopmental disorder without seizures were correlated in the spatial position of their variants on the protein structure. Finally, we show that phenotype integration in paralog variant selection improves variant classification.

Conclusion Our results show that paralogous variants, in particular when combined with phenotype information, can enhance our understanding of variant effects.
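The central idea of the abstract, transferring pathogenic evidence between paralogs at aligned positions, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' framework: the gene names, the toy alignment, and the rule that "conserved" means an identical reference residue in the same alignment column are all hypothetical.

```python
def column_to_residue(aligned_seq):
    """Map alignment column index -> 1-based residue position, skipping gaps."""
    mapping = {}
    pos = 0
    for col, ch in enumerate(aligned_seq):
        if ch != "-":
            pos += 1
            mapping[col] = pos
    return mapping


def paralog_supported_positions(alignment, pathogenic, target):
    """Find target-gene residues that align to a pathogenic position in a paralog.

    alignment:  {gene: gapped protein sequence}, all of equal length
    pathogenic: {gene: set of 1-based residue positions with a pathogenic variant}
    Assumption: a position counts as conserved only if both genes carry the
    same reference residue in that alignment column.
    """
    col_maps = {gene: column_to_residue(seq) for gene, seq in alignment.items()}
    supported = {}
    for col, target_pos in col_maps[target].items():
        for gene, seq in alignment.items():
            if gene == target:
                continue
            paralog_pos = col_maps[gene].get(col)
            if paralog_pos is None:  # gap in the paralog at this column
                continue
            same_residue = seq[col] == alignment[target][col]
            if same_residue and paralog_pos in pathogenic.get(gene, set()):
                supported.setdefault(target_pos, []).append((gene, paralog_pos))
    return supported


# Toy, hypothetical data: two aligned paralogs and one known pathogenic position.
alignment = {"GENE_A": "MK-LRV", "GENE_B": "MKTLR-"}
pathogenic = {"GENE_B": {5}}
print(paralog_supported_positions(alignment, pathogenic, target="GENE_A"))
```

In this toy case, residue 4 of the hypothetical GENE_A shares an alignment column and reference residue with the pathogenic position 5 of GENE_B, so a new variant at GENE_A residue 4 would gain paralog-based support.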