Interactions with polygenic background impact quantitative traits in the UK Biobank
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (Arcadia Science)
Abstract
Association studies have linked many genetic variants to a variety of phenotypes but understanding the biological mechanisms underlying these signals remains a major challenge. Since genes operate within complex networks, statistical interactions between genetic mutations that reflect biological pathways are expected to exist. However, their discovery has been hampered by the vast search space of variant combinations and the multiplicatively small expected effect sizes of interactions. To increase power, we created a test for interaction between single-nucleotide polymorphisms (SNPs) and groups of other variants with a direct effect on a phenotype aggregated in a polygenic score (PGS) which can be performed for any quantitative trait. In realistic simulations, this method avoids false positives and is well powered to find interaction networks. We apply it to 97 quantitative phenotypes in European samples in the UK Biobank and identify 144 independent interactions affecting 52 different traits, including important disease risk variants at genes such as APOE, FTO or TCF7L2. We develop approaches to refine identified signals and detect 38 pairwise interactions between SNPs. These include known interactions between ABO, FUT2 and TREH affecting alkaline phosphatase levels which are shown to be part of a larger network including PIGC and FUT6, as well as an interaction for eosinophil levels between IL33 and ALOX15, two genes whose functional interaction has recently been implicated in asthma. Finally, we propose a method to partition PGSs according to the binding sites of more than 1100 transcription factors using the HOCOMOCO motif database and test for interactions involving functionally partitioned scores. We identify 12 interactions affecting eight traits, two of which directly reflect known regulatory relationships such as that between TCF7L2 (a key regulator of glucose metabolism) and the transcription factor KDM2A, which are known to interact functionally within the Wnt signalling pathway, affecting glycated haemoglobin levels. This work significantly extends the set of known epistatic effects for human phenotypes and shows how statistical interactions can reflect biological interdependencies between genes.
Article activity feed
-
A second limitation is that we cannot currently test for interactions in cis because this risks false positives (which led us to build leave-one-chromosome-out PGSs for interaction testing).
Since you are dealing with a homogeneous population and regressing out ancestry components, this may not be as large an issue as you suspect. It would be nice to see some simulations of the false positive rate you expect when testing in cis. I imagine there are also a lot of important, true interactions within a chromosome.
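For readers unfamiliar with the leave-one-chromosome-out construction mentioned in the quoted passage, here is a minimal sketch of the idea, not the authors' pipeline. The function name `loco_pgs`, the toy genotype matrix and the four-chromosome layout are all assumptions made for illustration. The score used to test a SNP on a given chromosome simply omits every variant on that chromosome.

```python
import numpy as np

def loco_pgs(genotypes, betas, chrom, target_chrom):
    """Polygenic score that excludes all variants on `target_chrom`.

    genotypes: (n_samples, n_variants) allele counts
    betas:     (n_variants,) per-variant effect sizes
    chrom:     (n_variants,) chromosome label of each variant
    """
    keep = chrom != target_chrom
    return genotypes[:, keep] @ betas[keep]

# Toy data (hypothetical sizes): 500 samples, 4 "chromosomes" of 10 variants each.
rng = np.random.default_rng(0)
n_samples, n_variants = 500, 40
G = rng.binomial(2, 0.3, size=(n_samples, n_variants)).astype(float)
beta = rng.normal(0.0, 0.1, size=n_variants)
chrom = np.repeat(np.arange(1, 5), 10)

# Score to pair with any test SNP located on chromosome 1.
pgs_without_chr1 = loco_pgs(G, beta, chrom, target_chrom=1)
```

A cis-aware variant of this sketch would have to keep same-chromosome variants in the score, which is where the false-positive concern raised in the quote comes in.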
-
We therefore explored an initial approach to divide each trait’s PGS into functionally defined components for downstream testing.
Another way to partition PGSs is by the sign of the PGS SNP effect size. If a SNP interacts significantly with negative-effect-size PGS SNPs but not positive ones, or vice versa, this could help place causal SNPs/genes relative to other known players in a pathway, or distinguish between a pathway's functions in contrasting diseases. However, this may not work, since networks are complex and may involve both activating and inhibitory interactions.
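The sign-based partition suggested in this comment could be sketched as follows. This is only an illustration of the idea. The function name `sign_partitioned_pgs` and the toy data are assumptions, and nothing here comes from the preprint's code.

```python
import numpy as np

def sign_partitioned_pgs(genotypes, betas):
    """Split a PGS into the contribution of positive- and negative-effect variants."""
    pos = betas > 0
    neg = betas < 0
    pgs_pos = genotypes[:, pos] @ betas[pos]
    pgs_neg = genotypes[:, neg] @ betas[neg]
    return pgs_pos, pgs_neg

# Toy data (hypothetical): 200 samples, 30 variants.
rng = np.random.default_rng(1)
G = rng.binomial(2, 0.25, size=(200, 30)).astype(float)
beta = rng.normal(0.0, 0.1, size=30)

pgs_pos, pgs_neg = sign_partitioned_pgs(G, beta)
```

The two components add up to the full score, so each could be tested separately for interaction with a SNP in place of the full PGS.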
-
Having identified a considerable number of independent SNP×PGS interactions, we then leveraged these signals to find SNP×SNP interactions by running a GWAS of pairwise interaction for each SNP×PGS interaction hit. As this required running only a few GWASs for each of the 52 phenotypes for which we had identified SNP×PGS interactions, the number of statistical tests performed for each phenotype was of the same order of magnitude as a standard GWAS, therefore incurring only modest computational cost and requiring the usual Bonferroni correction for multiple testing.
Perhaps you could reduce complexity even further by building a PGS per chromosome, testing all chromosome PGS x chromosome PGS interactions, and then doing a subsequent SNPxSNP GWAS between the SNPs on the implicated chromosomes. I'm not sure how many SNPs are on each chromosome, but this could potentially reduce computational needs and the multiple testing burden since the number of epistatic interactions per trait is very small.
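To get a rough sense of the test counts in this two-stage proposal, the arithmetic can be sketched as below. The SNP count per chromosome is a made-up round number, since the comment itself notes that the true figure is unknown to the commenter. Only the number of autosome pairs is fixed.

```python
from math import comb

N_AUTOSOMES = 22
SNPS_PER_CHROM = 50_000   # hypothetical round number, for illustration only
IMPLICATED_PAIRS = 1      # hypothetical: chromosome pairs passing stage 1

# Stage 1: one test per pair of chromosome-level PGSs.
chrom_pair_tests = comb(N_AUTOSOMES, 2)  # 231

# Stage 2: SNP x SNP tests restricted to the implicated chromosome pairs.
snp_pairs_per_chrom_pair = SNPS_PER_CHROM ** 2
two_stage_tests = chrom_pair_tests + IMPLICATED_PAIRS * snp_pairs_per_chrom_pair

# Reference point: exhaustive testing of all cross-chromosome SNP pairs.
exhaustive_tests = chrom_pair_tests * snp_pairs_per_chrom_pair

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

stage1_threshold = bonferroni_threshold(0.05, chrom_pair_tests)
```

Under these assumed numbers the second stage still dominates the total, so the saving is relative to an exhaustive pairwise scan rather than to a single standard GWAS.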
-
We assume that relevant covariates (especially age, sex and a measure of ancestry to control for population structure, for which we use Ancestry Components [29]) have been regressed out from the phenotype in advance, which simplifies model fitting in practice.
If you didn't regress out ancestry, could the PGS term already sufficiently account for population structure? It may not be necessary to remove ancestry components if the PGS term absorbs polygenic background and therefore ancestry, which would allow you to use a larger and more diverse population to estimate the SNP×PGS term. It's unclear to me whether the PGS would fully account for this, but it may be worth trying.
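The two steps discussed here, regressing covariates out of the phenotype and then fitting a model with a SNP×PGS term, can be sketched with ordinary least squares. This is a generic interaction regression under simulated data, not necessarily the authors' exact model. The function names, effect sizes and covariates are all assumptions.

```python
import numpy as np

def residualize(y, covariates):
    """Return y with an intercept and the given covariates regressed out."""
    X = np.column_stack([np.ones(len(y)), covariates])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def interaction_fit(y_resid, snp, pgs):
    """OLS fit of y ~ 1 + SNP + PGS + SNP*PGS; returns (estimate, t) for SNP*PGS.

    The degrees of freedom ignore the covariates removed beforehand,
    which is a simplification of this sketch.
    """
    X = np.column_stack([np.ones(len(y_resid)), snp, pgs, snp * pgs])
    coef, *_ = np.linalg.lstsq(X, y_resid, rcond=None)
    resid = y_resid - X @ coef
    dof = len(y_resid) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[3], coef[3] / np.sqrt(cov[3, 3])

# Simulated data (all values hypothetical), with a true interaction effect of 0.1.
rng = np.random.default_rng(2)
n = 5000
age = rng.uniform(40, 70, size=n)
sex = rng.integers(0, 2, size=n).astype(float)
snp = rng.binomial(2, 0.3, size=n).astype(float)
pgs = rng.normal(size=n)
y = (0.02 * age + 0.3 * sex + 0.2 * snp + 0.5 * pgs
     + 0.1 * snp * pgs + rng.normal(size=n))

y_resid = residualize(y, np.column_stack([age, sex]))
estimate, t_stat = interaction_fit(y_resid, snp, pgs)
```

Trying the commenter's suggestion in this framework would amount to skipping ancestry terms in `residualize` and checking whether the interaction estimate stays calibrated in a structured sample.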
-