Choice of phenotype scale is critical in biobank-based G×E tests
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (Arcadia Science)
Abstract
The importance of gene-environment interactions (G×E) for complex human traits is heavily debated. Recently, biobank-based GWAS have revealed many statistically significant G×E signals, though most lack clear evidence of biological significance. Here, we partly explain this discrepancy by showing that many G×E signals simplify to additive effects on a different phenotype scale, a classical concern that is currently underappreciated. Our results clearly distinguish G×Sex effects on height, which vanish on the log scale, from G×Sex effects on testosterone, where the log scale uncovers biologically meaningful female-specific effects. Across 32 phenotypes in UK Biobank, we find that scaling by a power transformation can explain 46% of PGS×Sex interactions, and that simple log transformation can explain 23%, with similar results for other environments. We also show that phenotype scale can substantially impact GWAS discovery and the construction and evaluation of polygenic scores. Finally, we provide a set of guidelines to consider and choose phenotype scale in modern genetic studies.
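A short worked derivation may help make the central claim concrete. It is only an illustrative sketch, not the model used in the preprint: the symbols G (genotype or PGS), E (a binary environment such as sex), μ, β, γ and the noise term ε are assumptions introduced here, with ε taken to be independent of G and E.

```latex
% Assumed data-generating model: additive on the log scale
\log y = \mu + \beta G + \gamma E + \varepsilon

% The same model on the observed scale is multiplicative
\mathbb{E}[y \mid G, E] = C \, e^{\beta G} \, e^{\gamma E},
\qquad C = e^{\mu} \, \mathbb{E}[e^{\varepsilon}]

% The effect of G on the observed scale depends on E
\frac{\partial}{\partial G} \mathbb{E}[y \mid G, E]
  = \beta \, C \, e^{\beta G} \, e^{\gamma E}

% Ratio of the G effect between E = 1 and E = 0
\frac{\partial_G \mathbb{E}[y \mid G, E = 1]}{\partial_G \mathbb{E}[y \mid G, E = 0]}
  = e^{\gamma}
```

Under these assumptions, a linear model fitted to y on the observed scale will report a nonzero G×E term whenever β and γ are both nonzero, even though no interaction exists on the log scale.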
Article activity feed
-
It is likely that scale transformation has the greatest impact on the tails of the distribution, which are not the focus of our study but may be important clinically (56,59,61).
I don't think this point should be understated. The tails of the distribution are what doctors use to diagnose disease or dysfunction. I am interested to see future work exploring how misspecified scales might disproportionally impact different genders or ancestries.
-
Although scale-dependence is a textbook concern, its dramatic impact on modern G×E studies is underappreciated.
This is a simple and powerful study that highlights a major issue in G×E inference. It is hard enough to quantify environment in humans and construct generalizable PGSs, but this study points out that even misspecifying the scale, a seemingly minor choice, could lead to substantially incorrect results.
-
For example, we find complex interactions for biomarkers that are targeted by common drugs, like LDL, which may be eliminated by more sophisticated transformations (17).
Do you think finding specific functional forms that eliminate signal could actually help with disentangling complex biology? For example, if we know G×E is eliminated by a sub-additive transformation, perhaps we could deduce that buffering is occurring?
-
However, this is explained by the fact that Pearson R2 is itself scale-dependent: while the Pearson R2 does indeed dramatically increase for females (13-fold, from 0.05% to 0.65%), it decreases in males (from 5.41% to 2.83%), constituting a net loss because Pearson R2 weights groups by phenotypic variance (Fig 3E,F). On the other hand, evaluating PGS R2 on the log scale gives the expected result, where the PGS constructed from the log scale performs better (from 1.02% to 1.74%). This demonstrates how the choice of phenotype scale affects not only PGS but also the relative prioritization of individuals in evaluation metrics.
This is a great insight that I feel will help the portability of polygenic scores across groups - sex for sure, but also different ancestry groups.
-
A. Results of the PGS×Sex analysis. Each gray line shows the p-values of the interaction term between a PGS and sex.
It is impossible to distinguish the other PGSs from each other, which are presumably clustered close together at the bottom. Perhaps include an inset figure or another panel for these?
-
one that was significant on all tested scales (age, Fig S1).
How do you explain this relationship? It seems strange that this PGS×E interaction is scale-independent.
-
We also tested the rank-inverse normal transformation (RINT), which scales the phenotype to have approximately Gaussian quantiles. RINT did not depend on the observed scale, as expected, and always gave modest inflation for PGS×E effects (FPR=0.073), even when the additive scale is observed.
This is interesting, perhaps consider showing this in Figure 1?
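For readers unfamiliar with RINT, a minimal sketch follows. It assumes the Blom offset (3/8) and average ranks for ties, which are common choices but are not stated in the quoted passage. The function name `rint` is hypothetical. The example also illustrates why RINT "did not depend on the observed scale": it uses only ranks, so any strictly increasing transformation of the phenotype gives the same output.

```python
import math
from statistics import NormalDist

def rint(values, offset=3 / 8):
    """Rank-inverse normal transform (assumed Blom offset, average ranks for ties)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank within the tie run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    # Map ranks to (0, 1), then to standard normal quantiles.
    return [nd.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]

y = [10.0, 1.0, 100.0, 5.0]
z_raw = rint(y)
z_log = rint([math.log(v) for v in y])  # same ranks, so same transformed values
```

Because the transform discards everything except rank order, it cannot recover an additive scale on its own, which is consistent with the modest inflation the authors report.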
-
Figure 1.
This figure might be clearer if A-C had the same y axis and D-F had the same y axis.
-
We then profile PGS×E p-values across λ from -1 to 2, asking if some λ recovers a truly additive scale.
The patterns shown in Figure 1 are very distinct for situations with no GxE and with GxE. It doesn't seem like it would be a large amount of work for researchers in the future to produce a similar plot to justify their choice of scale transformation (or lack thereof) in their work.
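A rough sketch of such a profile is given below, using only the Python standard library. It is not the authors' pipeline. It assumes a Box-Cox power transform, a simulated phenotype that is additive on the log scale (λ = 0), ordinary least squares with homoscedastic standard errors, and a normal approximation for the p-value. All names (`box_cox`, `solve`, `interaction_test`) and simulation parameters are hypothetical.

```python
import math
import random

def box_cox(y, lam):
    """Box-Cox power transform; lam = 0 corresponds to the log scale."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * z for x, z in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def interaction_test(y, pgs, sex):
    """OLS of y on 1, pgs, sex, pgs*sex; returns (t, p) for the interaction term."""
    X = [[1.0, g, s, g * s] for g, s in zip(pgs, sex)]
    k, n = 4, len(y)
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((v - sum(b * x for b, x in zip(beta, r))) ** 2 for r, v in zip(X, y))
    # Last column of (X'X)^-1 gives the variance factor for the interaction.
    inv_col = solve(xtx, [0.0, 0.0, 0.0, 1.0])
    se = math.sqrt(rss / (n - k) * inv_col[3])
    t = beta[3] / se
    return t, math.erfc(abs(t) / math.sqrt(2.0))  # two-sided, normal approximation

# Simulated data (assumption): additive on the log scale, no true PGS x Sex effect.
rng = random.Random(0)
n = 4000
pgs = [rng.gauss(0.0, 1.0) for _ in range(n)]
sex = [float(rng.random() < 0.5) for _ in range(n)]
y = [math.exp(1.0 + 0.3 * g + 0.5 * s + rng.gauss(0.0, 0.3)) for g, s in zip(pgs, sex)]

lambdas = [-1.0 + 0.25 * i for i in range(13)]  # grid from -1 to 2
profile = {lam: interaction_test(box_cox(y, lam), pgs, sex) for lam in lambdas}
best_lambda = min(lambdas, key=lambda lam: abs(profile[lam][0]))

for lam in lambdas:
    t, p = profile[lam]
    print(f"lambda={lam:+.2f}  t={t:+8.2f}  p={p:.3g}")
```

In this simulation the interaction should be strongly significant on the observed scale (λ = 1) and fade near λ = 0. Plotting the printed p-values against λ gives a curve of the kind the comment above suggests researchers could report alongside their chosen scale.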
-