Optimizing phenotype scale improves genetic analyses in large-scale biobanks

Zhenhong Huang
Manuela Costantino
Andy Dahl

Abstract

Large-scale biobanks have enabled increasingly complicated genetic analyses across thousands of phenotypes. However, studies rarely consider the appropriate phenotype measurement scale, a problem that can drastically affect inferences on genetic architecture. Here, we introduce SIQReg, a practical solution to this classical problem, which learns a data-driven phenotype scale by minimizing heterogeneity across phenotype quantiles. Applied to complex traits in UK Biobank, SIQReg rejects the default scale for 24/25 traits. Generally, SIQReg scales lie between default and logarithmic, indicating that default-scale traits are neither purely additive nor purely multiplicative. We show that SIQReg improves both non-additive and additive genetic analyses. SIQReg eliminates most non-additive genetic signals (such as 97% of vQTL and 76% of quantile-dependent TWAS genes), indicating they may be statistical artifacts, while preserving biologically plausible non-additive signals. Simultaneously, SIQReg improves power to detect additive signals, increasing GWAS loci, TWAS genes, and PGS prediction accuracy by 11%, 13%, and 10%, respectively, and identifies 50% more high-risk individuals. These gains replicate across ancestry groups. Our results establish SIQReg as a principled approach to phenotype scale transformation that improves genetic analyses of complex traits.
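To make the scale problem concrete, here is a minimal toy simulation. It is not SIQReg and does not reproduce any analysis from the preprint. It assumes a single variant with a purely multiplicative effect and uses a Box-Cox power family as a stand-in for "a family of scales between default and logarithmic". The names `box_cox`, `simulate`, and `variance_ratio`, and all parameter values, are hypothetical choices for illustration. On the raw scale the phenotype variance differs by genotype, which looks like a vQTL; on the log scale the same data show roughly equal variances.

```python
import numpy as np


def box_cox(y, lam):
    """Box-Cox power transform: log(y) at lam = 0, (y**lam - 1) / lam otherwise."""
    y = np.asarray(y, dtype=float)
    if abs(lam) < 1e-12:
        return np.log(y)
    return (y ** lam - 1.0) / lam


def simulate(n=30000, maf=0.3, beta=0.2, sigma=0.3, seed=0):
    """Simulate genotypes (0/1/2) and a strictly positive phenotype.

    The genetic effect is additive on the log scale, so it is
    multiplicative on the raw (default) scale. All values are
    illustrative assumptions, not estimates from real data.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, maf, size=n)
    noise = rng.normal(0.0, sigma, size=n)
    y = np.exp(beta * g + noise)
    return g, y


def variance_ratio(g, y):
    """Phenotype variance among genotype-2 carriers divided by genotype-0 carriers."""
    return np.var(y[g == 2], ddof=1) / np.var(y[g == 0], ddof=1)


g, y = simulate()

# lam = 1 is the raw scale up to a shift; lam = 0 is the log scale.
ratio_raw = variance_ratio(g, box_cox(y, 1.0))
ratio_log = variance_ratio(g, box_cox(y, 0.0))

print(f"variance ratio on raw scale: {ratio_raw:.2f}")
print(f"variance ratio on log scale: {ratio_log:.2f}")
```

Under this simulation the expected raw-scale variance ratio is exp(4 * beta), about 2.23, while the log-scale ratio is 1 in expectation. The apparent variance effect is therefore a consequence of the measurement scale rather than a distinct genetic mechanism, which is the kind of artifact the abstract describes.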

Version published to 10.64898/2026.05.04.722531 on bioRxiv
May 7, 2026

Related articles

Structured Pooling Improves Detection of Rare Regulatory Mutations in Population-Scale Reporter Assays

This article has 9 authors:
1. Katherine Dura
2. Keith Siklenka
3. Kari Strause
4. Shauna Morrow
5. Chuangchuang Zhang
6. Alejandro Barrera
7. Andrew S. Allen
8. Timothy E. Reddy
9. William H. Majoros
This article has no evaluations. Latest version: Mar 31, 2026

The Human Pleiotropic Map of GWAS Associations and Therapeutic Implications

This article has 31 authors:
1. Yakov A. Tsepilov
2. Daniel Suveges
3. Daniel Considine
4. Szymon Szyszkowski
5. Xiangyu Jack Ge
6. Irene López Santiago
7. Polina Rusina
8. Tobi Alegbe
9. Vivien W. Ho
10. Kirill Tsukanov
11. Juan María Roldán-Romero
12. Ines A. Smit
13. Helena Cornu
14. Laura Harris
15. Kaur Alasoo
16. Alexander Predeus
17. Samuel Lessard
18. Clément Chatelain
19. Shameer Khader
20. Stephanie Yang
21. Anna O’Carroll
22. Yury S. Aulchenko
23. Daniel Seaton
24. Annalisa Buniello
25. Ewan Birney
26. Eric B. Fauman
27. Mark I. McCarthy
28. David G. Hulcoop
29. Gosia Trynka
30. Ellen M. McDonagh
31. David Ochoa
This article has no evaluations. Latest version: May 1, 2026

From GWAS to Causal Inference: A Beginner’s Guide to Mendelian Randomization with Code Examples

This article has 7 authors:
1. Ahmed M Salih
2. Roman Roy
3. Yuhe Wang
4. Irene Treccani
5. Andre Altmann
6. Zahra Raisi-Estabragh
7. Gloria Menegaz
This article has no evaluations. Latest version: Apr 9, 2026
