Why variant effect predictors and multiplexed assays agree and disagree

Benjamin J. Livesey
Joseph A. Marsh

Abstract

Computational variant effect predictors (VEPs) and multiplexed assays of variant effect (MAVEs) are two key tools for assessing the functional consequences of genetic variants. While their outputs are often concordant, there are also many differences. Here, we analyse missense MAVE data from 37 different human proteins, comparing them to five state-of-the-art VEPs in order to quantify and explain their points of agreement and disagreement. We find that discordance is not random but reflects fundamental differences in how each method infers functional impact. VEPs, which rely heavily on sequence conservation and basic structural features, tend to overcall pathogenicity at buried and hydrophobic residues, while underestimating impact in disordered regions and on charged surface residues. MAVEs, by contrast, capture context-specific mechanisms more accurately, but can miss pathogenic variants when the assay fails to reflect disease biology, or be subject to high levels of experimental noise. By comparing both global patterns and specific clinically relevant variants, we show how protein features, assay design, and variant type shape predictive discordance. Our findings provide a framework for interpreting when and why VEPs and MAVEs diverge and point toward strategies for improving variant interpretation through integrated, mechanism-aware approaches.
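As a rough illustration of what quantifying agreement between a VEP and a MAVE can look like, the sketch below computes a Spearman rank correlation between two score sets and flags variants whose rank positions differ strongly. It is not the authors' pipeline: the variant names, scores, the `gap` threshold, and the helper names `ranks`, `spearman`, and `discordant` are all hypothetical, and both score sets are assumed to be oriented so that higher means more damaging.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.
    Assumes neither input is constant (otherwise the variance is zero)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def discordant(variants, mave, vep, gap=0.5):
    """Variants whose rank fractions in the two score sets differ by more than gap."""
    n = len(variants)
    frac_mave = [r / n for r in ranks(mave)]
    frac_vep = [r / n for r in ranks(vep)]
    return [v for v, a, b in zip(variants, frac_mave, frac_vep) if abs(a - b) > gap]

# Hypothetical toy data, not taken from the preprint.
variants = ["A10V", "L25P", "R40Q", "G55D", "K70E"]
mave_scores = [0.1, 0.9, 0.2, 0.8, 0.7]   # assumed: higher = more damaging
vep_scores = [0.2, 0.95, 0.3, 0.7, 0.1]   # assumed: higher = more damaging

rho = spearman(mave_scores, vep_scores)
flagged = discordant(variants, mave_scores, vep_scores, gap=0.3)
```

In this toy example the only flagged variant is a charged surface-like substitution that the assay scores as damaging and the predictor scores as benign, the kind of case the abstract describes VEPs as underestimating. A real analysis would also need to handle score orientation, missing values, and assay noise per protein.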

Version published to 10.1101/2025.07.31.667868 on bioRxiv
Aug 1, 2025

