Simplifying causal gene identification in GWAS loci

Marijn Schipper
Jacob Ulirsch
Danielle Posthuma
Stephan Ripke
Karl Heilbron

Abstract

Genome-wide association studies (GWAS) help to identify disease-linked genetic variants, but pinpointing the most likely causal genes in GWAS loci remains challenging. Existing GWAS gene prioritization tools are powerful but often use complex black-box models trained on datasets containing unaddressed biases. Here, we use a data-driven approach to construct a truth set of causal genes in 406 GWAS loci. We train a gene prioritization tool, CALDERA, that uses a simple logistic regression model with L1 regularization and corrects for potential confounders. Using three independent benchmarking datasets of resolved GWAS loci, we compare the performance of CALDERA with three other methods (FLAMES, L2G, and cS2G). CALDERA outperforms all these methods in two out of three datasets and ranks second in the remaining dataset. We demonstrate that CALDERA prioritizes genes with expected properties, such as mutation intolerance (OR = 1.751 for pLI > 90%, P = 8.45 × 10^-3). Overall, CALDERA provides a powerful solution for prioritizing potentially causal genes in GWAS loci and may help identify novel genetics-driven drug targets.
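
As an illustration of the model class named in the abstract, the following is a minimal sketch of logistic regression with L1 regularization, fitted by proximal gradient descent in plain Python. It is not CALDERA's implementation: the feature names, the simulated data, and the values of `lam`, `lr`, and `n_iter` are all assumptions chosen for the example. The sketch shows how the L1 penalty shrinks the coefficients of uninformative features towards zero, and how a fitted coefficient converts to an odds ratio.

```python
import math
import random

# Hypothetical feature names, for illustration only (not CALDERA's features).
FEATURES = ["informative_score", "noise_a", "noise_b"]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w towards zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=500):
    """Fit L1-penalised logistic regression by proximal gradient descent.

    The intercept b is left unpenalised.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        # Gradient step on the log-loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def make_toy_data(n=400, seed=0):
    # Simulated data: only the first feature influences the label.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(3)]
        prob = sigmoid(-0.5 + 2.0 * x[0])
        X.append(x)
        y.append(1 if rng.random() < prob else 0)
    return X, y


X, y = make_toy_data()
w, b = fit_l1_logistic(X, y)
for name, wj in zip(FEATURES, w):
    # exp(coefficient) is the odds ratio per unit increase in the feature.
    print(f"{name}: coef={wj:.3f}, odds ratio={math.exp(wj):.3f}")
```

In this toy setting the informative feature keeps a clearly positive coefficient while the noise features are shrunk towards zero. A larger `lam` gives a sparser model at the cost of more shrinkage on the retained coefficients.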

Version published to 10.1101/2024.07.26.24311057 on medRxiv
Jul 29, 2024

