A machine-learning framework to characterize functional disease architectures and prioritize disease variants

Abstract

Modeling disease effect sizes from genome-wide association studies (GWAS) is critical both for advancing our understanding of the functional architecture of human disease and for providing informative priors that improve the prioritization of potentially causal variants. Here, we introduce the variant-to-disease (V2D) framework, an approach that uses machine-learning algorithms to model disease effect sizes from functional annotations and from posterior estimates of effects obtained via genome-wide fine-mapping. We benchmarked the V2D framework using simulations and real data analysis, demonstrating that it provides reliable estimates of heritability (h²) functional enrichment. By applying the V2D framework with linear trees to 15 UK Biobank traits, we identified non-linear relationships between constraint and regulatory annotations, highlighting constrained regulatory variants as the main functional component of disease architecture (h² enrichment = 17.3 ± 1.0x across 79 independent GWAS). By applying the V2D framework with neural networks, we developed GWAS prioritization scores that were strongly enriched in common-variant h² (20.6 ± 0.7x for the top 1% of scores), outperformed existing prioritization scores in the analysis of different GWAS datasets, were transportable to gene expression and non-European datasets, and improved variant prioritization in GWAS fine-mapping studies.
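
To make the enrichment figures concrete, the sketch below assumes the definition commonly used in partitioned-heritability analyses: the proportion of h² explained by the variants in a set, divided by the proportion of variants in that set. The abstract does not state its exact estimator, so this definition, the function name `h2_enrichment`, and the toy per-variant values are all illustrative assumptions rather than the V2D implementation.

```python
def h2_enrichment(per_variant_h2, in_annotation):
    """Heritability enrichment of a variant set (assumed definition):
    (share of total h2 carried by the set) / (share of variants in the set).
    """
    if len(per_variant_h2) != len(in_annotation):
        raise ValueError("inputs must have the same length")

    total_h2 = sum(per_variant_h2)
    n_variants = len(per_variant_h2)
    n_in = sum(1 for flag in in_annotation if flag)
    if total_h2 <= 0 or n_in == 0:
        raise ValueError("need positive total h2 and a non-empty variant set")

    h2_in = sum(h for h, flag in zip(per_variant_h2, in_annotation) if flag)
    return (h2_in / total_h2) / (n_in / n_variants)


# Hypothetical toy data: 100 variants, of which the single top-scoring
# variant (1% of variants) carries 20.6% of the total h2.
toy_h2 = [0.206] + [0.794 / 99] * 99
toy_mask = [True] + [False] * 99
enrichment = h2_enrichment(toy_h2, toy_mask)

# A set whose variants carry exactly their proportional share of h2.
uniform = h2_enrichment([0.01] * 100, [True] * 10 + [False] * 90)
```

Under this definition, an enrichment of 20.6x for the top 1% of scores would mean that those variants account for roughly 20.6% of common-variant h², and a set with no enrichment has a value of 1.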
