Choice of phenotype scale is critical in biobank-based G×E tests

Manuela Costantino
Renée Fonseca
Zhengtong Liu
Zhenhong Huang
Sriram Sankararaman
Iain Mathieson
Andy Dahl

Listed in

Evaluated articles (Arcadia Science)

Abstract

The importance of gene-environment interactions (G×E) for complex human traits is heavily debated. Recently, biobank-based GWAS have revealed many statistically significant G×E signals, though most lack clear evidence of biological significance. Here, we partly explain this discrepancy by showing that many G×E signals simplify to additive effects on a different phenotype scale, a classical concern that is currently underappreciated. Our results clearly distinguish G×Sex effects on height, which vanish on the log scale, from G×Sex effects on testosterone, where the log scale uncovers biologically meaningful female-specific effects. Across 32 phenotypes in UK Biobank, we find that scaling by a power transformation can explain 46% of PGS×Sex interactions, and that simple log transformation can explain 23%, with similar results for other environments. We also show that phenotype scale can substantially impact GWAS discovery and the construction and evaluation of polygenic scores. Finally, we provide a set of guidelines to consider and choose phenotype scale in modern genetic studies.

Arcadia Science
Mar 4, 2026

It is likely that scale transformation has the greatest impact on the tails of the distribution, which are not the focus of our study but may be important clinically (56,59,61).

I don't think this point should be understated. The tails of the distribution are what doctors use to diagnose disease or dysfunction. I am interested to see future work exploring how misspecified scales might disproportionately impact different genders or ancestries.

Arcadia Science
Mar 4, 2026

Although scale-dependence is a textbook concern, its dramatic impact on modern G×E studies is underappreciated.

This is a simple and powerful study that highlights a major issue in G×E inference. It is hard enough to quantify environment in humans and construct generalizable PGSs, but this study points out that even incorrectly specifying a scale, a seemingly minor thing, could lead to seriously incorrect results.
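
To make the scale issue concrete, here is a short worked derivation under an assumed toy model (not taken from the preprint): a phenotype Y that is purely additive on the log scale, with genetic value G, a binary environment E, and noise ε independent of both. The symbols μ, β, γ, and c are illustrative.

```latex
\begin{align*}
\log Y &= \mu + \beta G + \gamma E + \varepsilon
  && \text{(additive on the log scale, no G$\times$E term)} \\
\mathbb{E}[Y \mid G, E] &= c \, e^{\mu + \beta G + \gamma E},
  \qquad c = \mathbb{E}[e^{\varepsilon}] \\
\frac{\partial}{\partial G}\, \mathbb{E}[Y \mid G, E] &= \beta \, c \, e^{\mu + \beta G + \gamma E}
  && \text{(the slope in $G$ depends on $E$)} \\
\frac{\partial_G \, \mathbb{E}[Y \mid G, E = 1]}{\partial_G \, \mathbb{E}[Y \mid G, E = 0]} &= e^{\gamma}
\end{align*}
```

Under these assumptions, a linear model fit to Y on the raw scale would report a G×E interaction whenever γ ≠ 0, even though there is none on the log scale.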

Arcadia Science
Mar 4, 2026

For example, we find complex interactions for biomarkers that are targeted by common drugs, like LDL, which may be eliminated by more sophisticated transformations (17).

Do you think finding specific functional forms that eliminate signal could actually help with disentangling complex biology? For example, if we know that G×E is eliminated by a sub-additive transformation, perhaps we could infer that buffering is occurring?

Arcadia Science
Mar 4, 2026

However, this is explained by the fact that Pearson R² is itself scale-dependent: while the Pearson R² does indeed dramatically increase for females (13-fold, from 0.05% to 0.65%), it decreases in males (from 5.41% to 2.83%), constituting a net loss because Pearson R² weights groups by phenotypic variance (Fig 3E,F). On the other hand, evaluating PGS R² on the log scale gives the expected result, where the PGS constructed from the log scale performs better (from 1.02% to 1.74%). This demonstrates how the choice of phenotype scale affects not only PGS but also the relative prioritization of individuals in evaluation metrics.

This is a great insight that I feel will help the portability of polygenic scores across groups - sex for sure, but also different ancestry groups.
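
The arithmetic behind the quoted "net loss" can be sketched as follows. This is a simplified illustration, not the preprint's calculation: it assumes the pooled R² behaves like a weighted average of the sex-specific R² values, with weights equal to each sex's share of the total phenotypic variance (ignoring between-sex mean differences). The per-sex R² values are the ones quoted above; the weight `w_male` and the function names are hypothetical.

```python
# Per-sex Pearson R^2 values quoted in the passage above (as fractions).
RAW_PGS = {"female": 0.0005, "male": 0.0541}  # PGS built on the raw scale
LOG_PGS = {"female": 0.0065, "male": 0.0283}  # PGS built on the log scale


def pooled_r2(r2, w_male):
    """Variance-weighted average of per-sex R^2 (an assumed approximation).

    w_male is the hypothetical share of total phenotypic variance
    contributed by males; females get 1 - w_male.
    """
    return (1.0 - w_male) * r2["female"] + w_male * r2["male"]


def break_even_male_weight(a, b):
    """Male variance share at which scores a and b give equal pooled R^2."""
    gain_female = b["female"] - a["female"]
    loss_male = a["male"] - b["male"]
    return gain_female / (gain_female + loss_male)


if __name__ == "__main__":
    w_star = break_even_male_weight(RAW_PGS, LOG_PGS)
    print(f"break-even male variance share: {w_star:.3f}")
    for w in (0.1, 0.5, 0.9):  # hypothetical variance shares
        print(w, pooled_r2(RAW_PGS, w), pooled_r2(LOG_PGS, w))
```

Under this approximation, the log-scale PGS loses in the pooled metric whenever males carry more than roughly a fifth of the variance weight, despite the 13-fold improvement in females.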

Arcadia Science
Mar 4, 2026

A. Results of the PGS×Sex analysis. Each gray line shows the p-values of the interaction term between a PGS and sex.

It is impossible to distinguish the other PGSs, which are presumably clustered close together at the bottom, from one another. Perhaps include an inset figure or another panel for these?

Arcadia Science
Mar 4, 2026

one that was significant on all tested scales (age, Fig S1).

How do you explain this relationship? It seems strange that this PGSxE is scale-independent.

Arcadia Science
Mar 4, 2026

We also tested the rank-inverse normal transformation (RINT), which scales the phenotype to have approximately Gaussian quantiles. RINT did not depend on the observed scale, as expected, and always gave modest inflation for PGS×E effects (FPR=0.073), even when the additive scale is observed.

This is interesting; perhaps consider showing it in Figure 1?
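
For readers unfamiliar with RINT, a minimal sketch follows. It assumes the common Blom offset (c = 3/8) and average ranks for ties; the quoted passage does not say which variant the authors used, and the function name `rint` is illustrative. Because the transform depends only on ranks, any strictly increasing rescaling of the phenotype gives the same output, which matches the quoted observation that RINT did not depend on the observed scale.

```python
import math
from statistics import NormalDist


def rint(values, c=3.0 / 8.0):
    """Rank-based inverse normal transform.

    Assumptions (not from the source): Blom offset c = 3/8 and
    average ranks for tied values.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the block of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    # Map each rank to a quantile in (0, 1), then to a standard normal score.
    return [nd.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


if __name__ == "__main__":
    y = [2.0, 0.5, 8.0, 1.0, 4.0]  # toy positive phenotype values
    on_raw = rint(y)
    on_log = rint([math.log(v) for v in y])
    print(on_raw)
    print(on_raw == on_log)  # ranks are unchanged by the log transform
```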

Arcadia Science
Mar 4, 2026

Figure 1.

This figure might be clearer if A-C had the same y axis and D-F had the same y axis.

Arcadia Science
Mar 4, 2026

We then profile PGS×E p-values across λ from -1 to 2, asking if some λ recovers a truly additive scale.

The patterns shown in Figure 1 are very distinct for situations with no GxE and with GxE. It doesn't seem like it would be a large amount of work for researchers in the future to produce a similar plot to justify their choice of scale transformation (or lack thereof) in their work.
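
A profile of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' pipeline: the data are simulated to be additive on the log scale, the Box-Cox form of the power transformation is assumed, the interaction is tested with ordinary least squares using a normal approximation to the t statistic, and all names (`box_cox`, `interaction_p`, `profile`) are hypothetical.

```python
import math
import random


def box_cox(y, lam):
    """Box-Cox power transform; lam = 0 is the log. Requires y > 0."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * z for x, z in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def interaction_p(y, g, e):
    """Two-sided p-value for the g*e term in y ~ 1 + g + e + g*e."""
    X = [[1.0, gi, ei, gi * ei] for gi, ei in zip(g, e)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(v * v for v in resid) / (len(y) - k)
    # The last entry of (X'X)^{-1} e_4 is the variance factor for beta[3].
    var_factor = solve(xtx, [0.0, 0.0, 0.0, 1.0])[3]
    z = beta[3] / math.sqrt(sigma2 * var_factor)
    return math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation


# Simulated data (assumption): additive on the log scale, no true interaction.
random.seed(1)
n = 4000
g = [random.gauss(0.0, 1.0) for _ in range(n)]       # stand-in for a PGS
e = [float(random.random() < 0.5) for _ in range(n)]  # stand-in for sex
y = [math.exp(0.3 * gi + 0.5 * ei + random.gauss(0.0, 0.3))
     for gi, ei in zip(g, e)]

lambdas = (-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0)
profile = {lam: interaction_p(box_cox(y, lam), g, e) for lam in lambdas}

if __name__ == "__main__":
    for lam, p in profile.items():
        print(f"lambda={lam:+.1f}  interaction p={p:.3g}")
```

In this simulation the interaction is strongly significant on the raw scale (λ = 1) and should lose its signal near λ = 0, the scale on which the data were generated to be additive. With real data, the same loop could be run on the observed phenotype to justify a choice of scale.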

Version published to 10.64898/2026.01.20.694695 on bioRxiv
Jan 21, 2026
