A machine-learning framework to characterize functional disease architectures and prioritize disease variants

Siliangyu Cheng
Artem Kim
Dhrithi Deshpande
Steven Gazal

Abstract

Modeling disease effect sizes from genome-wide association studies (GWAS) is critical for both advancing our understanding of the functional architecture of human disease and providing informative priors that enhance the prioritization of potentially causal variants. Here, we introduce the variant-to-disease (V2D) framework, an approach that leverages machine-learning algorithms to model disease effect sizes from posterior estimates of effects obtained via genome-wide fine-mapping and functional annotations. We benchmarked the V2D framework using simulations and real data analysis, demonstrating that it provides reliable estimates of heritability (h²) functional enrichment. By applying the V2D framework with linear trees to 15 UK Biobank traits, we identified non-linear relationships between constraint and regulatory annotations, highlighting constrained regulatory variants as the main functional component of disease functional architecture (h² enrichment = 17.3 ± 1.0x across 79 independent GWAS). By applying the V2D framework with neural networks, we developed GWAS prioritization scores, which were extremely enriched in common variant h² (20.6 ± 0.7x for the top 1% scores), outperformed existing prioritization scores in the analysis of different GWAS datasets, were transportable to analyze gene expression and non-European datasets, and improved variant prioritization in GWAS fine-mapping studies.
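To make the reported enrichment figures concrete, the sketch below assumes the definition commonly used in stratified heritability analyses: the share of h² explained by a variant set divided by the share of variants in that set. The abstract does not spell out this definition, and the function names are hypothetical, so treat this as an illustration rather than the authors' code.

```python
def h2_enrichment(h2_share: float, variant_share: float) -> float:
    """Enrichment = (share of h2 explained) / (share of variants).

    Assumed definition; both arguments are proportions in [0, 1].
    """
    if not 0 < variant_share <= 1:
        raise ValueError("variant_share must be in (0, 1]")
    if not 0 <= h2_share <= 1:
        raise ValueError("h2_share must be in [0, 1]")
    return h2_share / variant_share


def implied_h2_share(enrichment: float, variant_share: float) -> float:
    """Invert the definition: share of h2 implied by an enrichment value."""
    if not 0 < variant_share <= 1:
        raise ValueError("variant_share must be in (0, 1]")
    return enrichment * variant_share


# Point estimate from the abstract: 20.6x enrichment for the top 1% of scores.
top1_share = implied_h2_share(enrichment=20.6, variant_share=0.01)
print(f"Implied share of common-variant h2: {top1_share:.1%}")
```

Under this assumed definition, the top 1% of scored variants would account for roughly a fifth of common-variant h². The 17.3x figure for constrained regulatory variants cannot be converted in the same way, because the abstract does not report what fraction of variants carries that annotation.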

Version published to 10.1101/2025.10.23.25338598 on medRxiv
Oct 24, 2025

Related articles

Path-Probability Models Outperform Point-Estimate Scores for Noncoding GWAS Gene Prioritization

This article has 1 author:
1. Abduxoliq Ashuraliyev
This article has no evaluations
Latest version Dec 22, 2025

Global Evaluation of Congenital Heart Disease-Associated Non-Coding Variants

This article has 27 authors:
1. José Rodríguez-Martínez
2. Edwin Peña-Martínez
3. Shreya Sharma
4. Joshua Medina-Feliciano
5. Elise Root
6. Lois Parks
7. Marissa Granitto
8. Diego Pomales-Matos
9. Jean Messon-Bird
10. Adriana Barreiro-Rosario
11. Leandro Sanabria-Alberto
12. Alejandro Rivera-Madera
13. Jessica Rodríguez-Ríos
14. Rosalba Velázquez-Roig
15. Juan Figueroa-Rosado
16. Mackenzie Noon
17. Omer Donmez
18. Carmy Forney
19. Hayley Hesse
20. Katelyn Dunn
21. Xiaoting Chen
22. Matthew Hass
23. Lucinda Lawson
24. Matthew Weirauch
25. Leah Kottyan
26. Steven Reilly
27. Devesh Bhimsaria
This article has no evaluations
Latest version Jan 7, 2026

Personalized Disease Prediction Framework based on Genomic Variants and Disease Histories using Deep Embeddings and Alignment-based Process Conformance Checking

This article has 4 authors:
1. Daewoo Pak
2. Hyunwoo Jo
3. Seon Kim
4. Jongchan Kim
This article has no evaluations
Latest version Jan 20, 2026
