GBoost-CTL: A novel method in multi-tissue transcriptome-wide associations studies in cross-tissue learner incorporating GWAS information

Abstract

Genome-wide association studies (GWAS) have uncovered numerous genetic variants linked to complex human diseases, yet connecting these variants to the transcripts and tissues that drive pathology remains difficult. Multi-tissue transcriptome-wide association studies (TWAS) offer a powerful bridge, but existing analytical methods have limitations: they either discard important signals by analyzing tissues separately and then aggregating the results, apply imputation models in individual tissues, or fuse tissues with weights that ignore how much GWAS signal each tissue actually carries. As a result, most existing methods do not perform uniformly across different GWAS cohorts. Here, we propose GBoost-CTL, a GWAS-boosted cross-tissue learner designed to overcome these limitations. The method starts with any collection of single-tissue learners (STLs), allowing investigators to choose the most suitable imputation engine for each tissue. It then (i) allocates weights according to each STL’s out-of-sample predictive accuracy and (ii) refines those weights by incorporating GWAS-derived information, so that informative tissues are automatically up-weighted while uninformative tissues are down-weighted. Through extensive simulation, we have found that this dual weighting strategy lets GBoost-CTL adapt to fully shared, partially shared, or highly tissue-specific regulatory architectures while preserving nominal type I error control and delivering substantially higher power than existing linear or covariance-based methods.
When applied to real data, GBoost-CTL consistently outperformed existing multi-tissue TWAS methods (e.g., TWAS-CTL, UTMOST, and PrediXcan), identifying a greater number of disease-associated genes at more stringent p-value thresholds. Given its modular design, computational scalability, and demonstrable gains in discovery power, we believe that GBoost-CTL offers a practical tool for the analysis of multi-tissue TWAS.
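
To make the dual weighting idea concrete, the following is a minimal illustrative sketch and not the authors' implementation. The abstract does not give the weighting formulas, so everything here is an assumption: step (i) is taken to be weights proportional to each STL's cross-validated R² (negative values clipped to zero), and step (ii) is taken to be a rescaling by the squared single-tissue association z-score as a stand-in for "GWAS-derived information". The function names (`accuracy_weights`, `gwas_refined_weights`, `cross_tissue_expression`) and the toy numbers are hypothetical. The sketch also does not address how type I error is controlled once GWAS information enters the weights.

```python
import numpy as np


def accuracy_weights(cv_r2):
    """Step (i), assumed form: weights proportional to out-of-sample R^2."""
    r2 = np.clip(np.asarray(cv_r2, dtype=float), 0.0, None)  # negative R^2 -> 0
    total = r2.sum()
    if total == 0.0:
        # No tissue predicts expression; fall back to equal weights.
        return np.full(r2.shape, 1.0 / r2.size)
    return r2 / total


def gwas_refined_weights(base_weights, tissue_z):
    """Step (ii), assumed form: rescale by squared z-score, then renormalize."""
    base = np.asarray(base_weights, dtype=float)
    signal = np.asarray(tissue_z, dtype=float) ** 2
    refined = base * signal
    total = refined.sum()
    if total == 0.0:
        # No GWAS signal anywhere; keep the accuracy-based weights.
        return base
    return refined / total


def cross_tissue_expression(imputed, weights):
    """Combine imputed expression, shape (n_individuals, n_tissues), into one score."""
    return np.asarray(imputed, dtype=float) @ np.asarray(weights, dtype=float)


# Toy example with three tissues (all numbers are made up for illustration).
cv_r2 = [0.10, 0.10, -0.02]   # hypothetical out-of-sample R^2 per tissue
tissue_z = [3.0, 1.0, 0.5]    # hypothetical single-tissue association z-scores
imputed = [[1.0, 2.0, 3.0],
           [0.0, 1.0, 0.0]]   # hypothetical imputed expression, 2 individuals

w_acc = accuracy_weights(cv_r2)
w_final = gwas_refined_weights(w_acc, tissue_z)
score = cross_tissue_expression(imputed, w_final)
```

In this toy case the first two tissues start with equal accuracy-based weights, and the refinement shifts weight toward the first tissue because it carries more of the assumed GWAS signal; the third tissue, with no predictive accuracy, stays at zero.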
