Beyond guilty by association at scale: searching for causal variants on the basis of genome-wide summary statistics
Abstract
Understanding the causal genetic architecture of complex phenotypes will fuel future research into disease mechanisms and potential therapies. Here, we illustrate the power of a novel framework that, starting from summary statistics, detects across the entire genome sets of variants that carry non-redundant information on the phenotypes and are therefore more likely to be causal in a biological sense. The approach, implemented in open-source software, is also computationally efficient, requiring less than 15 minutes on a single CPU to perform a genome-wide analysis. Through extensive genome-wide simulation studies, we show that the method can substantially outperform existing methods in false discovery rate control, statistical power, and various fine-mapping criteria. In an application to a meta-analysis of ten large-scale genetic studies of Alzheimer’s disease (AD), we identified 82 loci associated with AD, including 37 additional loci missed by the conventional GWAS pipeline. Massively parallel reporter assays and CRISPR-Cas9 experiments have confirmed the functionality of the putative causal variants our method points to. Finally, we retrospectively analyzed summary statistics from 67 large-scale GWAS for a variety of phenotypes. The results reveal the method’s capacity to robustly discover additional loci for polygenic traits and to pinpoint potential causal variants underpinning each locus beyond the conventional GWAS pipeline, contributing to a deeper understanding of complex genetic architectures in post-GWAS analyses.