The missing link between genetic association and regulatory function

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    A commonly held hypothesis about how genetic variants predispose to common diseases and other human traits is that variants have phenotypic effects by altering transcript accumulation. The authors question this view by showing some evidence for shared genetic control of transcript abundance for genes believed to be involved in the traits, and for the traits themselves.

Abstract

The genetic basis of most traits is highly polygenic and dominated by non-coding alleles. It is widely assumed that such alleles exert small regulatory effects on the expression of cis-linked genes. However, despite the availability of gene expression and epigenomic datasets, few variant-to-gene links have emerged. It is unclear whether these sparse results are due to limitations in available data and methods, or to deficiencies in the underlying assumed model. To better distinguish between these possibilities, we identified 220 gene–trait pairs in which protein-coding variants influence a complex trait or its Mendelian cognate. Despite the presence of expression quantitative trait loci near most GWAS associations, by applying a gene-based approach we found limited evidence that the baseline expression of trait-related genes explains GWAS associations, whether using colocalization methods (8% of genes implicated), transcription-wide association (2% of genes implicated), or a combination of regulatory annotations and distance (4% of genes implicated). These results contradict the hypothesis that most complex trait-associated variants coincide with homeostatic expression QTLs, suggesting that better models are needed. The field must confront this deficit and pursue this ‘missing regulation.’

Article activity feed

  1. Author Response

    Reviewer 2 (Public Review):

    1. The hypothesis that the genes responsible for the Mendelian traits are also the causal genes for the cognate complex traits does not seem to hold, given the prior work and the data shown in the study. For example, if this hypothesis is true, it is unexplained why the candidate genes were not even enriched in the GWAS regions for height and breast cancer.

    Following the removal of a data artifact from our breast cancer analysis and the inclusion of Backman et al.’s larger list of genes implicated in height, every phenotype in our analysis displays enrichment in proximity to GWAS peaks. Enrichment is present not only in genes selected based on cognate Mendelian phenotypes, but also in those from Backman et al., which examined the same complex trait phenotypes that were used for GWAS. In that work, the enrichment of GWAS signal near genes selected on coding variants was as high as 59.3-fold.

    Our use of Mendelian-trait-causing genes is not dependent on GWAS. Short of large-scale experimental work, we do not know any better way to confirm the genes’ broad relevance to GWAS phenotypes than their enrichment near peaks. This enrichment has been persuasively demonstrated by previous research. Freund et al. (2019) tested the enrichment of 20 Mendelian disorder gene sets against 62 complex phenotypes. Though there was no statistically significant overlap of phenotypically non-matched Mendelian genes and GWAS peaks (2% matched), the overlap of matched Mendelian genes and GWAS peaks was significant (54% matched).

    We have included additional evidence and references for this relationship in Supp. Note 1.

    1. The only evidence supporting their hypothesis appears to be the enrichment of the candidate genes in the GWAS regions for seven out of the nine traits. However, significant enrichment of the candidate genes in the GWAS regions does not necessarily mean that a large proportion of the candidate genes are the causal genes responsible for the GWAS signals. Analogously, we cannot use the strong enrichment of eQTLs in GWAS regions as evidence to claim that a large proportion of the GWAS signals are driven by eQTLs.

    Our gene sets were selected by considering two criteria: whether they are relevant to each complex trait, and whether they are biologically interpretable.

    The genes identified in Backman et al. have a strong case for relevance. They are evaluated for association, not with cognate Mendelian phenotypes, but with the exact same complex traits used for GWAS.

    Our genes, selected based on cognate Mendelian traits, are less obviously relevant, but have advantages for interpretation. Many have well-understood biological roles and are part of pathways that have been studied in great detail. Because most of these genes can cause dramatic phenotypic changes with one variant, the direction of effect is easier to understand than genes identified through burden testing. In fact, loss-of-function coding variants that cause autosomal dominant traits can be thought of as large-effect, context-independent eQTLs—they cause phenotypic change by decreasing gene expression roughly 50% across cell types, developmental stages, etc.

    Ideal genes for our analysis would combine the advantages of both sets. They would have individual coding variants that could be tied to complex traits using exome sequences. However, natural selection creates tradeoffs between variant frequencies and variant effect sizes. Large-effect variants (such as those responsible for Mendelian traits) are generally too rare to be detected in population sequencing. Coding variants that reach frequencies detectable in databases such as UK Biobank typically have smaller effect sizes, requiring them to be aggregated in order to implicate genes.

    We believe that our original gene set is plausible both because of its collective enrichment in GWAS signal and because each gene is individually known to cause cognate phenotypes. Enrichment is not proof, but can serve as strong evidence when backed up by known biology. Though selection precludes a perfect gene set, the enrichment in both our Mendelian gene set and the set from Backman et al. addresses each criterion—interpretability and relevance—individually, and, taken together, provides an argument for the relevance of genes selected based on coding variants.

    1. Considering the large numbers of GWAS signals, we would expect a substantial number of genes in the GWAS regions by chance. It would be interesting to quantify the number of genes in the GWAS regions if the 143 genes are randomly selected. Correcting the observed number of genes for that expected by chance (e.g., subtracting the observed number by that expected by chance), the proportion of the candidate genes in the GWAS regions would be small.

    The proportion of the candidate genes whose eQTL signals were colocalized with the GWAS signals or in close physical proximity with the fine-mapped GWAS hits was small. However, I would not be surprised if they are significantly enriched, compared with that expected by chance (e.g., quantified by repeated sampling of the 143 genes at random).

    Taking random sets of genes, or the entire set of non-putatively-causative genes, shows that, given the size of our gene set, we would expect 43 randomly selected genes to fall within 1 Mb of a peak (95% confidence interval: 31.5-54.5). Instead, we find 147 peak-adjacent genes. The enrichment increases closer to genes: within 100 kb, we find 104 putatively causative genes, but the null model predicts only 11 (95% CI: 4.5-17.0), a roughly ten-fold difference.
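The null comparison described here can be illustrated with a short permutation sketch. The code below is not the authors' pipeline: the function names, the single-chromosome toy genome, the gene and peak coordinates, and the 500-permutation default are all assumptions made for illustration. It draws random gene sets of the same size as a candidate set, counts how many genes fall within a window of any peak, and reports the null mean, a central 95% interval, the fold enrichment, and a one-sided empirical p-value.

```python
import random

def count_near_peaks(genes, peaks, window):
    """Count genes lying within `window` bp of at least one peak on the same chromosome."""
    return sum(
        any(g_chrom == p_chrom and abs(g_pos - p_pos) <= window for p_chrom, p_pos in peaks)
        for g_chrom, g_pos in genes
    )

def permutation_null(background, peaks, set_size, window, n_perm=500, seed=0):
    """Peak-adjacent counts for `n_perm` random gene sets of `set_size`, sorted ascending."""
    rng = random.Random(seed)
    return sorted(
        count_near_peaks(rng.sample(background, set_size), peaks, window)
        for _ in range(n_perm)
    )

def summarize(null_counts, observed):
    """Null mean, central 95% interval, fold enrichment, one-sided empirical p-value."""
    n = len(null_counts)
    mean = sum(null_counts) / n
    lo = null_counts[int(0.025 * n)]
    hi = null_counts[min(n - 1, int(0.975 * n))]
    # Add-one correction keeps the empirical p-value strictly above zero.
    p = (1 + sum(c >= observed for c in null_counts)) / (n + 1)
    fold = observed / mean if mean > 0 else float("inf")
    return {"null_mean": mean, "ci95": (lo, hi), "fold": fold, "p_empirical": p}

# Toy data (entirely synthetic): one 100 Mb chromosome, 20 peaks, 2,000 background genes.
rng = random.Random(1)
peaks = [("chr1", rng.randrange(100_000_000)) for _ in range(20)]
background = [("chr1", rng.randrange(100_000_000)) for _ in range(2000)]
# Hypothetical candidate set: 50 genes deliberately placed within 50 kb of a peak.
candidates = [("chr1", p + rng.randint(-50_000, 50_000)) for _, p in rng.choices(peaks, k=50)]

observed = count_near_peaks(candidates, peaks, window=100_000)
null_counts = permutation_null(background, peaks, set_size=len(candidates), window=100_000)
result = summarize(null_counts, observed)
print(observed, result)
```

With real data, `background` would hold coordinates for the non-candidate genes and `peaks` the lead GWAS variants. Matching random genes to candidates on properties such as gene length or local gene density would likely give a more conservative null than the uniform sampling used in this sketch.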

    Enrichment remains significant even when using a more conservative null. It may be that genes like ours, with importance to phenotype, are more likely than random genes to fall near GWAS peaks, even if their phenotype does not correspond to the GWAS phenotype. In this case, we might see enrichment even in the absence of a relationship between our Mendelian and complex traits. To account for this, we also assessed significance by testing gene sets against mismatched phenotypes (e.g. testing our LDL genes with a UC GWAS, and our height genes with a T2D GWAS). The results of this permutation are visible in Supp. Fig. 1, and further confirm the enrichment.
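The mismatched-phenotype comparison can be sketched as a small cross-tabulation: each gene set is scored against every trait's peaks, and the matched cell is compared with the mismatched ones. Everything below is hypothetical, including the coordinates, the trait labels used as dictionary keys, and the 100 kb window; it shows only the shape of the comparison, not the analysis behind Supp. Fig. 1.

```python
def near_any_peak(gene, peaks, window):
    """True if the gene lies within `window` bp of any peak on the same chromosome."""
    chrom, pos = gene
    return any(chrom == p_chrom and abs(pos - p_pos) <= window for p_chrom, p_pos in peaks)

def overlap_table(gene_sets, peak_sets, window):
    """table[set_name][trait] = number of genes in the set near that trait's peaks."""
    return {
        set_name: {
            trait: sum(near_any_peak(g, peaks, window) for g in genes)
            for trait, peaks in peak_sets.items()
        }
        for set_name, genes in gene_sets.items()
    }

# Synthetic peak and gene coordinates, for illustration only.
peak_sets = {
    "LDL": [("chr1", 1_000_000), ("chr2", 5_000_000)],
    "UC": [("chr1", 50_000_000)],
    "T2D": [("chr3", 10_000_000)],
}
gene_sets = {
    "LDL": [("chr1", 1_050_000), ("chr2", 4_950_000), ("chr1", 50_020_000)],
    "UC": [("chr1", 49_950_000), ("chr3", 80_000_000)],
    "T2D": [("chr3", 10_090_000), ("chr3", 9_000_000)],
}

table = overlap_table(gene_sets, peak_sets, window=100_000)
for name, row in table.items():
    matched = row[name]
    mismatched = [n for trait, n in row.items() if trait != name]
    print(name, "matched:", matched, "mismatched:", mismatched)
```

If trait-relevant genes were simply more peak-prone in general, the mismatched cells would be expected to look like the matched ones; enrichment specific to the matched cell argues against that explanation.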

    Finally, a non-expression-based analysis found that Mendelian genes had large enrichments in heritability. Like our study, it included Mendelian genes for diabetes and LDL: the Mendelian diabetes genes were enriched 65-fold for common-variant heritability and the Mendelian LDL genes were enriched 212-fold (Weiner et al. 2022).

    Though it is true that the number of colocalizations and TWAS hits likely represents a statistically significant enrichment over all genes, we feel that this does not affect the conclusions of the paper. The model that noncoding variants identified by GWAS act as eQTLs certainly has some truth: colocalization and TWAS studies have found, in total, many associations. But the model has not lived up to expectations. This has been suggested, albeit inconclusively, by the failure of most GWAS peaks to colocalize. By evaluating not the portion of loci that can be tied to a gene, but the portion of already-implicated genes that can be tied to a locus, we believe the model’s deficiencies become both clearer and more puzzling.

    1. It is unclear how the authors selected the breast cancer genes. If the genes were selected based on tumor somatic mutations, it is a problem because there is no evidence supporting that somatic mutation target genes are also cancer germline risk genes.

    Genes for breast cancer were selected using the MutPanning method (Dietlein et al. 2020), which takes somatic mutations found in tumors, and evaluates them in the context of known mutation patterns. The relationship between somatic and germline variants in cancer is little studied. We believe it is meaningful that, as explained in our response to overall comment 2ii, we do now find an enrichment of our breast cancer genes near GWAS peaks. Though these genes are very unlikely to be a perfect set, the conclusions of our paper remain true with or without the inclusion of this phenotype.

    1. The authors observed no enrichment of the candidate genes in height and breast cancer GWAS regions. In this case, should these traits and the corresponding genes be removed from the subsequent analyses?

    The reviewers’ notes about enrichment, and its absence in height and breast cancer (BC), prompted us to review our analysis of it. The enrichment for five of our phenotypes remained significant, and the lack of enrichment for breast cancer genes proved artifactual. After accounting for the artifact, the breast cancer genes show the same pattern as most other phenotypes: highly significant enrichment compared with both the genomic background and a permutation analysis. Supplementary Figure 1 has been updated to reflect this change, and to add the enrichments found in Backman et al.

    Because our original analysis of height has nominal, but not corrected, significance for enrichment, the problem may be one of power. The set of height genes identified by Backman et al. is larger than our original set and displays a significant enrichment in proximity to GWAS signal. This enrichment is also present when the two gene sets are combined, as shown in the updated Supp. Figure 1.

    Reviewer 3 (Public Review):

    1. The positive results are substantially reduced when restricting the analyses to a set of selected tissues of relevance to the trait. Isn't it implicated that the selection of relevant tissues in this study is not comprehensive, and further, tissue specificity is common in mediating genetic effects by gene expression? First, it seems some apparently relevant tissues are not selected (Table 2), such as bone for height (Finucane et al. 2015 NG). One approach to assess the relevant tissues for the predefined set of putatively causative genes is to see if these genes are enriched in the differentially expressed gene sets for those tissues. Second, among 84 putatively causative genes overlapped with GWAS signals, they identified 39 genes by TWAS, 11 genes by fine mapping with linear distance to chromatin modification features, and 41 genes by fine mapping with ChromHMM enhancer annotations, but these numbers reduced substantially to 9, 5 and 27 when restricting the same analysis to the selected tissues for each trait. If genes function only in the relevant tissues, I think using bulk expression data would lose power but is unlikely to give false positives. Thus, it is possible that for the traits analysed, not all relevant tissues are selected so that only a fraction of genes identified in bulk expression analysis can be replicated in the tissue-specific analysis. This appears to me a notable piece of evidence to support the hypothesis of biological context that the authors tend to have reservations in discussion.

    Testing for colocalizations or TWAS hits in all tissues may increase power for several reasons. First, it is possible that some GTEx tissues have unrealized relevance to our phenotypes. Second, in the event that a tissue is not present in GTEx, we may still detect relevant eQTLs in a tissue that is not itself involved in the trait, but which has similar patterns of expression. Finally, some tissues may be correct, but underpowered due to their small sample size. In this case, we may better detect the colocalization in tissues that are “irrelevant,” but are well-powered and have correlated expression.

    However, this creates problems of interpretation. Say we find, for example, a colocalization of an APOE eQTL with an LDL GWAS peak in skin tissue. Does this mean that skin tissue contributes to LDL levels? Is it simply because skin tissue has more samples than liver? Are we uncovering a strange, unexpected pleiotropy?

    We believe we can achieve both objectives—power and interpretability—with our use of MASH (Urbut et al. 2019) as described in response 3 of the first section. Briefly, MASH is a Bayesian tool that we use to update the estimates of eQTLs in GTEx data. Each tissue is adjusted to incorporate signals detected in other tissues with similar expression. This mitigates the danger of ignoring the correct tissue, and increases the power of tissues with small sample sizes. Its benefit is demonstrated by the substantial increase in the number of expression-GWAS colocalizations identified by coloc—however, the number of genes identified that fall within our putatively causative gene sets remains strikingly small.

    1. How much do both LD differences between GWAS and eQTL samples and the presence of allelic heterogeneity contribute to the observed low colocalization rate? One of their main findings is the low colocalization between trait-associated variants and eQTL in non-coding regions, which accounts for only 7% of the putatively causative genes. In discussion, the authors believe that this finding cannot be explained by lack of statistical power and is directly supported by a Bayesian analysis which reported high posterior probabilities of distinct signals for GWAS and eQTL. I agree that power is probably not a big issue. However, my concern is that given the large difference in sample size between GWAS and GTEx datasets, any small differences in LD between the two samples might cause a statistical separation of the signals even when trait phenotype and gene expression truly share a causal variant. Moreover, the presence of more than one causal variant with allelic heterogeneity in the locus may also play a part in the failure of colocalization. Consider two causal variants for the complex trait, one regulating the target gene and the other regulating another gene in co-expression. Potentially, the presence of the second causal variant would diminish the colocalization probability at the target gene.

    The ability of our statistical tools to actually find colocalizations is critical to this project. Small sample size increases the variance of the LD matrix, but it is only one of many factors that influence power, which also include LD differences between study populations and eQTL effect sizes.

    Though we restricted both GWAS and GTEx samples to subjects with European ancestry and used PCs as covariates, the reviewers are correct that there are likely to be LD differences between samples, due to both slight variations in populations and the smaller sample sizes of GTEx. Analyses of colocalization tools in cases of mismatched LD have shown that decreases in power are small. Chun et al. (2017) tested JLIM in simulated conditions of modest population mismatch, using CEU haplotypes to create the GWAS, and haplotypes from all non-Finnish Europeans for eQTL associations. They then attempted to distinguish shared vs. distinct causative variants for GWAS and eQTL, finding no decrease in sensitivity or specificity (Supp. Fig. 6 of Chun et al. 2017).

    The case in which two genes are co-regulated by nearby variants, both causative for the GWAS trait, creates a condition of allelic heterogeneity for the GWAS trait (as opposed to the expression trait). Chun et al. evaluated JLIM’s loss of power as a result of AH, and found that the power loss is small, except in cases in which the two variants have equal effects (Supp. Fig. 10). Testing cases in which the AH occurs for the expression trait returned a similar result (Supp. Fig. 9).

    Hukku et al. (2021) performed similar analyses on coloc, eCAVIAR, and fastENLOC. Allelic heterogeneity was found to reduce the power of coloc by about a factor of 2. Testing on different pairs of populations, they concluded that extreme LD mismatches (e.g. Finnish vs. Yoruban samples) can lead to substantial power loss, but moderate LD mismatches (e.g. Finnish vs. British samples) do not. Though a factor of two is substantial, it would not change the qualitative conclusions of this paper. Overall, given the variety of methods we employ (including those, such as JLIM, that are more robust to AH), we are confident that the methods, taken together, have been shown to be robust to the concerns raised.

    Finally, TWAS should, by design, be less vulnerable to LD differences and allelic heterogeneity. This design has its own failure modes: it can produce false positives when genes with correlated expression are identified together despite only one being causative, and it can prioritize non-causative genes over causative ones. Generally, however, both genes will be identified (Wainberg et al. 2019).

    1. Perhaps the authors can perform some simulations to quantify the influence of tissue-specific expression effects, LD differences between eQTL and well-powered GWAS, and allelic heterogeneity, as discussed above, on their analyses. I understand that the authors may not be willing to do as it would involve a lot of work. But I'd like to see at least some discussion on how these questions can be better addressed in the future research.

    These are nuanced technical questions, and to address them by simulation in our paper would, as noted, involve a lot of work. We have summarized previous work that evaluated the effects of LD differences and AH in our response to essential revision 4. We discuss our concerns about the possibility of an overly broad tissue search in essential revisions 3 and 5, and our decision to address this question using MASH in essential revision 3.

    1. It looks quite striking that only 6% of the putatively causative genes are identified by TWAS with the correct effect direction. But I think this number is slightly misleading as one may interpret it as only 6% of the functionally relevant genes are regulated by trait-associated variants. In fact, 46% of the genes are detected by TWAS but only 11% are confirmed in their selected tissues, among which about half (5/9) have correct effect direction. First, the result could be limited by the selection of relevant tissues, as discussed above. Second, the fact that half of the genes do not show correct effect direction may reflect a nonlinear relationship between expression and trait, or the presence of cell-type heterogeneity within a tissue. These may not necessarily overturn the assumption that these genes are regulated by trait-associated variants in the causal tissues or cell types.

    In our initial submission, we had been reluctant to expand the list of tissues for two reasons. First, increasing from the small number of tissues with known biological relevance to all tissues (or all non-brain tissues) increases the multiple-testing correction burden. Second, and in our eyes more important, colocalizations in tissues without clear biological relevance are not biologically interpretable. Such hits can result from complicated genetic architecture (e.g. shared eQTLs), power differences in tissues with correlated expression, or biology not directly related to the trait in question.
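The multiple-testing cost of widening the tissue search can be made concrete with a small worked example. The sketch assumes a Bonferroni correction over every gene-by-tissue test, which the response does not specify, and the gene and tissue counts are hypothetical round numbers rather than values from the study.

```python
def bonferroni_threshold(alpha, n_genes, n_tissues):
    """Per-test p-value threshold when every gene is tested in every tissue."""
    return alpha / (n_genes * n_tissues)

# Hypothetical numbers for illustration only (not taken from the study).
alpha, n_genes = 0.05, 100
few_tissues, all_tissues = 3, 45

strict = bonferroni_threshold(alpha, n_genes, few_tissues)
broad = bonferroni_threshold(alpha, n_genes, all_tissues)
print(f"{few_tissues} tissues: p < {strict:.2e}")
print(f"{all_tissues} tissues: p < {broad:.2e}")
print(f"per-test threshold shrinks by a factor of {strict / broad:.0f}")
```

Under these assumed numbers, going from 3 to 45 tissues shrinks the per-test threshold 15-fold, so a signal must be considerably stronger to survive correction in the broader search.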

    That said, the tissue data we have access to are incomplete, and we are without question missing some relevant tissues. Additionally, some relevant tissues have lower sample sizes, and thus lower power, than tissues that are not relevant but may still share eQTLs. To overcome these problems, we applied Multivariate Adaptive Shrinkage (MASH), a Bayesian method that detects correlations between different conditions (in this case, tissues) and uses them to produce posterior estimates of summary statistics in each tissue (Urbut et al. 2019). Unlike meta-analysis, which produces one result, MASH yields a distinct effect size estimate for each tissue, though the estimates inform one another.
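The borrowing of information across tissues can be illustrated with a deliberately simplified two-tissue calculation. This is not MASH itself: MASH fits a mixture of covariance patterns learned from the data, whereas the sketch assumes a single, known multivariate normal prior with covariance U and independent sampling errors V, for which the posterior mean is U(U + V)^-1 applied to the observed effects. All numbers and names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def shrink_two_tissues(bhat, se, prior_sd, prior_corr):
    """Posterior mean and covariance of the true effects in two tissues.

    Model (a simplifying assumption): bhat ~ N(b, V) and b ~ N(0, U), with
    V = diag(se^2) and U built from one prior SD and one cross-tissue correlation.
    """
    u_diag, u_off = prior_sd ** 2, prior_corr * prior_sd ** 2
    U = [[u_diag, u_off], [u_off, u_diag]]
    S = [[U[0][0] + se[0] ** 2, U[0][1]], [U[1][0], U[1][1] + se[1] ** 2]]
    K = matmul2(U, inv2(S))  # weight given to the observed effects
    post_mean = [K[i][0] * bhat[0] + K[i][1] * bhat[1] for i in range(2)]
    KU = matmul2(K, U)
    post_cov = [[U[i][j] - KU[i][j] for j in range(2)] for i in range(2)]
    return post_mean, post_cov

# Tissue 0 is well powered (small SE); tissue 1 is underpowered (large SE).
bhat, se = [0.50, 0.10], [0.05, 0.40]
post_mean, post_cov = shrink_two_tissues(bhat, se, prior_sd=0.5, prior_corr=0.9)
print([round(x, 3) for x in post_mean])  # about [0.491, 0.364]
print(round(post_cov[1][1] ** 0.5, 3))   # posterior SD in tissue 1, about 0.194
```

In this toy case the noisy estimate in the underpowered tissue moves from 0.10 toward the well-measured effect and its uncertainty falls from 0.40 to roughly 0.19, while the well-powered tissue barely changes. The two estimates stay distinct, which is the contrast with a meta-analysis that returns a single pooled value.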

    Using MASH has a pronounced effect on colocalization results. The number of non-putatively causative genes colocalizing increases from 389 to 489, while the number of putatively causative genes in our Mendelian set is unchanged, remaining at 2. The number of genes from the Backman et al. set increases from 2 to 5. Though this is a proportionally large increase, it still represents a small fraction of genes. We have updated our paper to use these results—which should be less dependent on the tissues we selected—but the message has not changed.

    1. While they highlight the roles of alternative regulatory mechanisms, few testable hypotheses are put forward for the field, which is somewhat disappointing but understandable given how little we know about the human genome at the mechanistic level.

    We have added a set of models that may explain the “missing heritability” to Table 4 in the discussion. Though we do not propose experiments, we have included citations for research relevant to confirming or disproving these models.

  2. eLife assessment

    A commonly held hypothesis about how genetic variants predispose to common diseases and other human traits is that variants have phenotypic effects by altering transcript accumulation. The authors question this view by showing some evidence for shared genetic control of transcript abundance for genes believed to be involved in the traits, and for the traits themselves.

  3. Reviewer #1 (Public Review):

    The majority of genetic variants associated with complex human traits reside in the non-coding genome, leading to the assumption that they act through transcriptional regulation. In this work, Connally et al. set out to challenge this widespread assumption by showing that genes with plausible links to both severe/familial and common complex forms of the same traits show limited evidence of colocalization with eQTLs or TWAS signals.

    More specifically, they first establish that putatively causative genes for severe or familial forms of human traits are enriched for nearby non-coding variants associated with common complex forms of the same traits. Next, using colocalization in tissues related to these traits, they show that for only 7% of these genes is the same variant driving the trait and gene expression associations. In addition, only 6% of these genes are TWAS hits with correct effect direction. Finally, they provide a thorough discussion of possible causes for the lack of colocalization and TWAS hits. These include, among others, an incorrect assumption that the underlying biological causes of an extreme phenotypic presentation are similar to the causes of the polygenic form, the lack of statistical power of GWAS, eQTL, and/or colocalization analyses, the lack of the right biological context for the eQTL effect, and alternative regulatory mechanisms.

    The main conclusion of this work, i.e., that the mechanism by which our genes influence complex traits is generally not their baseline expression, is partly justified by the data and results presented here for the seven traits which show a significant overlap of severe/familial and common complex trait genes. The paper introduces a very useful framework to test this hypothesis by leveraging the joint signals from extreme and polygenic forms of disease to build some form of a set of true positive cases, in which the gene driving trait variation is known. The study also opens up a lot of interesting discussions about alternative hypotheses to fill this gap of 'missing regulation'.

    However, the very limited number of traits studied, and the possible alternative explanation of their results, especially by the combination of lack of power and the right biological context, severely limit the generalization of their main conclusion across all/most complex human traits. Adding more traits would be needed to increase confidence in and generalizability of the results supporting the main conclusion. In addition, the study is testing for colocalization with eQTLs identified in bulk post-mortem adult tissues. However, several studies have shown that cell type-dependent/specific eQTLs (Westra et al PLoS Genet 2015, Zhernakova et al. NatGen 2017, Lu et al BioRxiv 2021), as well as response eQTLs (Moyerbrailean et al Genome Res 2016), are particularly enriched in disease association. Due to the limited number of well-powered response or single-cell eQTL studies, it is yet unclear how many of these eQTLs are captured by steady-state bulk tissue eQTLs. This, in combination with the low power of colocalization analyses (Barbeira et al. BioRxiv 2020), is also a very likely explanation of the lack of colocalization of (putatively causative) genes reported here. A better understanding of the degree to which these findings are driven by a lack of sufficiently granular eQTLs is needed.

  4. Reviewer #2 (Public Review):

    In this article entitled "The missing link between genetic association and regulatory function", Connally and colleagues attempted to quantify the extent to which genetic variants affect complex traits by altering the expression levels of putative causal genes. They focused on nine complex traits (including four common diseases) for which large-scale GWAS data were available. They curated 143 candidate genes (127 unique genes) for Mendelian forms of the traits under the assumption that genes causing the Mendelian form of the complex traits should also be the genes influencing complex trait variation in the general population. They found enrichment of the candidate genes in the GWAS regions (+/- 1Mb of a genome-wide significant signal) for all the complex traits but height and breast cancer. They then investigated the proportion of the candidate genes whose eQTL signals are colocalized with the GWAS signals for the nine traits, the proportion of the genes in close physical proximity with the fine-mapped GWAS variants, and the proportion of genes whose functionally active regions, annotated using chromatin modification and activity data, overlap the fine-mapped GWAS variants. All the proportions appeared to be small.

    Major comments

    The hypothesis that the genes responsible for the Mendelian traits are also the causal genes for the cognate complex traits does not seem to hold, given the prior work and the data shown in the study. For example, if this hypothesis is true, it is unexplained why the candidate genes were not even enriched in the GWAS regions for height and breast cancer.

    The only evidence supporting their hypothesis appears to be the enrichment of the candidate genes in the GWAS regions for seven out of the nine traits. However, significant enrichment of the candidate genes in the GWAS regions does not necessarily mean that a large proportion of the candidate genes are the causal genes responsible for the GWAS signals. Analogously, we cannot use the strong enrichment of eQTLs in GWAS regions as evidence to claim that a large proportion of the GWAS signals are driven by eQTLs.

    Considering the large numbers of GWAS signals, we would expect a substantial number of genes in the GWAS regions by chance. It would be interesting to quantify the number of genes in the GWAS regions if the 143 genes are randomly selected. Correcting the observed number of genes for that expected by chance (e.g., subtracting the observed number by that expected by chance), the proportion of the candidate genes in the GWAS regions would be small.

    The proportion of the candidate genes whose eQTL signals were colocalized with the GWAS signals or in close physical proximity with the fine-mapped GWAS hits was small. However, I would not be surprised if they are significantly enriched, compared with that expected by chance (e.g., quantified by repeated sampling of the 143 genes at random).

    It is unclear how the authors selected the breast cancer genes. If the genes were selected based on tumor somatic mutations, it is a problem because there is no evidence supporting that somatic mutation target genes are also cancer germline risk genes.

    The authors observed no enrichment of the candidate genes in height and breast cancer GWAS regions. In this case, should these traits and the corresponding genes be removed from the subsequent analyses?

  5. Reviewer #3 (Public Review):

    Connally et al. investigated a central question in complex trait genomics: what is the main mechanism that mediates the effects of trait-associated variants in non-coding regions, which harbour most of the signals identified by genome-wide association studies (GWAS)? It is widely perceived that these variants affect trait phenotypes by regulating expression of genes in cis that are functionally relevant to the trait. The authors argue that this is not true because they find limited evidence linking the trait-associated non-coding variants to a set of putatively causative genes that are known to cause the severe form of the complex trait. The authors discussed four possible explanations for their observations. They argue that incorrect assumptions and lack of statistical power are not likely to be critical, withhold their judgment on the biological context, and claim that the most convincing explanation is the existence of alternative regulatory mechanisms. This conclusion is very important and sobering if it is true because it will inform where to invest the most effort in future GWAS.

    It is an interesting idea of using genes of known roles in the "Mendelian forms" of the cognate complex traits as true positives to investigate the biology of non-coding variants. The analyses are done carefully. The discussion of the results is sharp, stands high, and provides lots of food for thought. My major comments lie in the strength of support of their results for the conclusion of "missing regulation" likely attributed to alternative regulatory mechanisms. The results presented seem to also support the biological context hypothesis that non-coding variants regulate gene expression in a tissue or cell type-specific manner.

    Major comments:

    The positive results are substantially reduced when the analyses are restricted to a set of selected tissues of relevance to the trait. Does this not imply that the selection of relevant tissues in this study is not comprehensive and, further, that tissue specificity is common when genetic effects are mediated by gene expression?
    First, some apparently relevant tissues are not selected (Table 2), such as bone for height (Finucane et al. 2015 NG). One approach to assessing the relevant tissues for the predefined set of putatively causative genes is to test whether these genes are enriched in the differentially expressed gene sets for those tissues. Second, among 84 putatively causative genes overlapping GWAS signals, the authors identified 39 genes by TWAS, 11 genes by fine mapping with linear distance to chromatin modification features, and 41 genes by fine mapping with ChromHMM enhancer annotations, but these numbers fell substantially to 9, 5 and 27 when the same analyses were restricted to the selected tissues for each trait. If genes function only in the relevant tissues, I think using bulk expression data would lose power but would be unlikely to give false positives. Thus, it is possible that, for the traits analysed, not all relevant tissues were selected, so that only a fraction of the genes identified in the bulk expression analysis can be replicated in the tissue-specific analysis. This appears to me to be a notable piece of evidence in support of the biological context hypothesis, about which the authors express reservations in the discussion.
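
    To make the size of this reduction concrete, the following minimal sketch computes the fraction of genes retained by each method after restricting to the selected tissues. It uses only the counts quoted above; the dictionary keys and the helper name `retained_percent` are illustrative labels, not names from the paper.

```python
# Gene counts quoted in the comment above: (all tissues, selected tissues).
# The labels are illustrative shorthand for the three methods.
counts = {
    "TWAS": (39, 9),
    "fine mapping + distance to chromatin features": (11, 5),
    "fine mapping + ChromHMM enhancers": (41, 27),
}

def retained_percent(all_tissues, selected_tissues):
    """Percentage of genes that remain after the tissue restriction."""
    return round(100 * selected_tissues / all_tissues)

retained = {method: retained_percent(*pair) for method, pair in counts.items()}

for method, pct in retained.items():
    before, after = counts[method]
    print(f"{method}: {after}/{before} retained ({pct}%)")
```

    Under these counts, TWAS retains the smallest share of its genes, which is the method-level pattern the comment draws on.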

    How much do both LD differences between GWAS and eQTL samples and the presence of allelic heterogeneity contribute to the observed low colocalization rate?
    One of their main findings is the low colocalization between trait-associated variants and eQTL in non-coding regions, which accounts for only 7% of the putatively causative genes. In the discussion, the authors argue that this finding cannot be explained by lack of statistical power, and that this is directly supported by a Bayesian analysis reporting high posterior probabilities of distinct signals for GWAS and eQTL. I agree that power is probably not a big issue. However, my concern is that, given the large difference in sample size between the GWAS and GTEx datasets, even small differences in LD between the two samples might cause a statistical separation of the signals when trait phenotype and gene expression truly share a causal variant. Moreover, the presence of more than one causal variant in the locus, with allelic heterogeneity, may also contribute to the failure of colocalization. Consider two causal variants for the complex trait, one regulating the target gene and the other regulating another, co-expressed gene. The presence of the second causal variant could diminish the colocalization probability at the target gene.

    Perhaps the authors can perform some simulations to quantify the influence of tissue-specific expression effects, LD differences between eQTL and well-powered GWAS, and allelic heterogeneity, as discussed above, on their analyses. I understand that the authors may not be willing to do so, as it would involve a lot of work. But I would like to see at least some discussion of how these questions can be better addressed in future research.

    It looks quite striking that only 6% of the putatively causative genes are identified by TWAS with the correct effect direction. But I think this number is slightly misleading, as one may interpret it to mean that only 6% of the functionally relevant genes are regulated by trait-associated variants. In fact, 46% of the genes are detected by TWAS but only 11% are confirmed in the selected tissues, among which about half (5/9) have the correct effect direction. First, the result could be limited by the selection of relevant tissues, as discussed above. Second, the fact that about half of the genes do not show the correct effect direction may reflect a nonlinear relationship between expression and trait, or the presence of cell-type heterogeneity within a tissue. These observations do not necessarily overturn the assumption that these genes are regulated by trait-associated variants in the causal tissues or cell types.
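
    The percentages in this comment can be reproduced from the gene counts given earlier in the review. The sketch below assumes that the denominator is the 84 putatively causative genes overlapping GWAS signals, which is consistent with the quoted figures; the labels are illustrative.

```python
# Assumption: all three percentages share the denominator of 84
# putatively causative genes that overlap GWAS signals.
TOTAL_GENES = 84

twas_counts = {
    "detected by TWAS, any tissue": 39,
    "confirmed in selected tissues": 9,
    "confirmed with correct effect direction": 5,
}

def percent(count, total=TOTAL_GENES):
    """Percentage of the total, rounded to the nearest integer."""
    return round(100 * count / total)

percentages = {label: percent(n) for label, n in twas_counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {twas_counts[label]}/{TOTAL_GENES} = {pct}%")
```

    Seen this way, the 6% figure is the end of a filtering chain (39, then 9, then 5 genes) rather than a direct estimate of how many relevant genes are regulated by trait-associated variants.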

    While they highlight the roles of alternative regulatory mechanisms, few testable hypotheses are put forward for the field, which is somewhat disappointing but understandable given how little we know about the human genome at the mechanistic level.