Human Proteome-wide Mechanistic Interpretation of Missense Variants through Protein Feature Enrichment Score

Abstract

Missense variant interpretation remains a central challenge in clinical and medical genetics, and most observed variants are classified as variants of uncertain significance (VUS). Computational variant effect predictors can achieve high pathogenicity classification performance, but they neither reveal the underlying mechanism nor provide a translatable interpretation. Here we present the Protein Feature Enrichment Score (PFES), which quantifies the molecular context of missense variants through statistical enrichment of 103 protein structural, functional, and physicochemical features across 85,321 pathogenic and 130,719 control variants spanning 20 protein functional classes. We show that the protein feature (PF) enrichment patterns of variants are conserved within functional classes and vary substantially across classes, in both magnitude and direction, depending on functional context. PFES not only partitions variants into PF-Enriched (pathogenic-like), PF-Neutral, and PF-Depleted (benign-like) categories but also provides a mechanistic interpretation by decomposing the score into subscores from biologically interpretable protein feature attributes. We demonstrate that PFES shows high concordance with VUS reclassification and prioritization: across 596 genes, pathogenicity-leaning VUS-high variants were seven-fold enriched in PF-Enriched variants. PFES decomposition further revealed that gain-of-function variants are distinguished from loss-of-function variants by disproportionate enrichment of protein-protein interaction features. We computed PFES across 223 million possible missense variants (17.7% PF-Enriched) and built a publicly available resource that addresses not just whether a variant is pathogenic, but which protein characteristics are disrupted.
Proteome-wide application across 20,153 genes prioritizes established rare disease genes and nominates therapeutically amenable targets whose pathogenic variation is driven by interpretable structural and functional protein feature disruption.

One Sentence Summary

PFES is a proteome-wide resource to quantify the protein context of missense variants, enabling mechanistically transparent variant interpretation.
