Benchmarking of variant pathogenicity prediction methods using a population genetics approach

Abstract

Motivation

Variant pathogenicity predictors are essential for identifying new associations between genetic variants and rare diseases. However, despite the availability of numerous predictors, there is no clear consensus on which methods provide the most reliable results. The common practice of training, testing, and benchmarking these predictors using known variant sets from disease or mutagenesis studies raises concerns about ascertainment bias and data circularity.

Results

We benchmarked commonly used pathogenicity predictors using an orthogonal approach that does not rely on predefined “ground truth” datasets. By leveraging population-level genomic data from gnomAD and the Context-Adjusted Proportion of Singletons (CAPS) metric, we identified CADD and REVEL as the best-performing predictors for distinguishing extremely deleterious variants from moderately deleterious ones. REVEL demonstrated superior calibration. Additionally, we show that CAPS can serve as a meta-analysis tool for interpreting variant annotations and highlight biases in ClinVar-based predictor training.

Availability and Implementation

CAPS analysis and benchmarking results are available at https://github.com/mgudVCCRI/PopGenVariantFiltering

Contact

e.giannoulatou@victorchang.edu.au
