Instance-based Transfer Learning Enables Cross-Cohort Early Detection of Colorectal Cancer
Abstract
Colorectal cancer (CRC) continues to be a major global public health challenge. Extensive research has underscored the critical role of the gut microbiome in CRC diagnostics. However, early-stage prediction of CRC, particularly at the precancerous adenoma (ADA) stage, remains challenging because microbial features are unstable across cohorts. In this study, we conducted a systematic analysis of 2,053 gut metagenomes from 14 globally sampled public cohorts and a newly recruited cohort. Despite substantial regional and cohort-level heterogeneity in microbiome composition, we found that consistent dynamic patterns of microbial signatures form the fundamental basis for CRC detection. These patterns enabled robust performance in both inter-cohort and independent validations using an optimized bioinformatics framework. In contrast, such a basis was lacking in ADA-associated microbial markers, limiting the generalizability of early detection models. To address this, we developed an instance-based transfer learning approach, Meta-iTL, which effectively leveraged knowledge from existing datasets to detect CRC risk at the ADA stage in the newly recruited cohort. Thus, Meta-iTL overcomes the challenges posed by cohort-specific variability and limited data availability, and advances the application of non-invasive approaches for the early screening and prevention of CRC.
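The abstract does not describe how Meta-iTL selects or weights instances, so the following is only a generic sketch of the instance-based transfer idea, not the authors' method. It assumes a simple nearest-neighbour rule: source-cohort samples that lie closest to the target-cohort samples are kept and pooled with the small target set for training. The function name `select_source_instances`, the Euclidean distance, the parameter `k`, and the toy abundance values are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_source_instances(source, target, k=1):
    """Return sorted indices of source samples that are among the k nearest
    source neighbours of at least one target sample (assumed selection rule)."""
    selected = set()
    for t in target:
        ranked = sorted(range(len(source)), key=lambda i: euclidean(source[i], t))
        selected.update(ranked[:k])
    return sorted(selected)

# Toy two-feature profiles (hypothetical values, not data from the study).
source_X = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.5, 0.5]]
source_y = [0, 0, 1, 1]          # labels in the existing (source) cohorts
target_X = [[0.88, 0.12], [0.15, 0.85]]
target_y = [0, 1]                # few labelled samples in the new cohort

# Keep only the source instances that resemble the target cohort,
# then pool them with the target samples to form the training set.
idx = select_source_instances(source_X, target_X, k=1)
train_X = [source_X[i] for i in idx] + target_X
train_y = [source_y[i] for i in idx] + target_y
```

Any classifier could then be fitted on `train_X` and `train_y`. In practice the selection rule, distance measure, and `k` would need to be tuned on held-out target data.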