Bias in genome-wide association test statistics due to omitted interactions

Burak Yelmen
Merve Nur Güler
Estonian Biobank Research Team
Tõnu Kollo
Märt Möls
Guillaume Charpiat
Flora Jay

Abstract

Over the past two decades, genome-wide association studies (GWAS) have enabled the discovery of thousands of variants associated with many complex human traits. However, conventional GWAS are still widely performed with linear models, under the assumption that genetic effects are predominantly additive. In this work, we investigate how the test statistic behaves when linear models are used to detect significant genotype-phenotype associations without accounting for epistasis. We first algebraically derive the mean and variance shift in the null statistic due to the omitted interaction term, and define the boundary between the conservative (i.e., deflated statistic tail) and anti-conservative (i.e., inflated statistic tail) regimes at the common GWAS significance threshold. We then perform phenotype simulation analyses using the Estonian Biobank genotypes and validate the mathematical model. We demonstrate that the anti-conservative regime is plausible under realistic parameter settings and that models omitting interaction terms can produce spurious significance. Our findings suggest caution when interpreting statistically significant signals reported in the literature based on linear models, especially for large-scale GWAS.
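
The kind of miscalibration the abstract describes can be illustrated with a small toy simulation. This is only a sketch under assumptions that are not taken from the preprint: two independent SNPs in Hardy-Weinberg equilibrium, a phenotype driven by the product of their centered genotypes plus unit-variance Gaussian noise, and a plain ordinary least squares test of the first SNP. The function names, allele frequencies, sample size, and the effect size `gamma` are all hypothetical, and the code does not reproduce the authors' derivation or their Estonian Biobank analysis. In this particular toy setup the tested SNP has no marginal effect, yet the average test statistic tends to drift above its nominal null mean of about 1 once `gamma` is nonzero.

```python
import random


def draw_genotype(rng, maf):
    """Draw a 0/1/2 allele count under Hardy-Weinberg equilibrium."""
    return (rng.random() < maf) + (rng.random() < maf)


def additive_test_statistic(x, y):
    """Squared t-statistic for the slope in the OLS fit y ~ intercept + x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0:
        return None  # monomorphic sample: the slope is undefined
    beta = sxy / sxx
    resid_var = (syy - beta * sxy) / (n - 2)
    return beta * beta * sxx / resid_var


def mean_null_statistic(gamma, n=300, reps=1000, maf1=0.1, maf2=0.3, seed=1):
    """Average additive test statistic for SNP 1 across simulated cohorts.

    Assumed (hypothetical) phenotype model:
        y = gamma * (g1 - 2*maf1) * (g2 - 2*maf2) + N(0, 1)
    Centering makes SNP 1 marginally null, so any departure of the
    average statistic from about 1 comes from the omitted interaction.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        g1 = [draw_genotype(rng, maf1) for _ in range(n)]
        g2 = [draw_genotype(rng, maf2) for _ in range(n)]
        y = [
            gamma * (a - 2 * maf1) * (b - 2 * maf2) + rng.gauss(0.0, 1.0)
            for a, b in zip(g1, g2)
        ]
        stat = additive_test_statistic(g1, y)
        if stat is not None:
            stats.append(stat)
    return sum(stats) / len(stats)


if __name__ == "__main__":
    print("no interaction  :", mean_null_statistic(gamma=0.0))
    print("with interaction:", mean_null_statistic(gamma=2.0))
```

Comparing the two printed averages gives a rough sense of how an omitted interaction can shift the null statistic. The toy only exhibits inflation; it says nothing about where the boundary between the conservative and anti-conservative regimes lies, which is what the preprint derives.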

Version published to 10.1101/2025.11.21.689603 on bioRxiv
Nov 22, 2025

Related articles

Causal effect heterogeneity estimation using summary statistics

This article has 8 authors:
1. Xingjie Shi
2. Yadong Yang
3. Minxi Bai
4. Jiacheng Miao
5. Stephen Dorn
6. Jonathan Haugstad
7. Jin Liu
8. Qiongshi Lu
This article has no evaluations. Latest version: Jan 14, 2026

Application of longitudinal follow-up data increases power in the identification of genetic loci for type 2 diabetes

This article has 1 author:
1. Seong Beom Cho
This article has no evaluations. Latest version: Dec 18, 2025

An Advanced Entropy Approach for Minimizing False Discoveries in Imputation-Based Association Analyses

This article has 4 authors:
1. Zhihui Zhang
2. Dakai Zhu
3. Xiangjun Xiao
4. Christopher I. Amos
This article has no evaluations. Latest version: Dec 17, 2025
