scTransMIL Bridges Patient-Level Phenotypes and Single-Cell Transcriptomics for Cancer Screening and Heterogeneity Inference

Abstract

Single-cell sequencing technology has revolutionized cancer research by revealing unprecedented insights into tumor heterogeneity. However, reliably connecting patient-level cancer phenotypes with single-cell transcriptomic profiles remains challenging because of technical constraints and labeling ambiguity. This in turn hinders precise cancer screening and in-depth study of tumor mechanisms based on single-cell sequencing. To bridge this gap, we introduce scTransMIL, a scRNA-seq Transformer-based Multi-Instance Learning framework that learns whole-genome context to deliver comprehensive cancer insights at three biological scales (sample, cell, and gene): (1) accurate patient-level cancer phenotype prediction, (2) precise single-cell disease scoring (validated on 4 million single cells), and (3) genome-wide biomarker discovery. Benchmark experiments demonstrate scTransMIL’s exceptional performance, including robust out-of-distribution generalization and clinically relevant prediction of metastatic tissue-of-origin, a crucial capability for identifying cancers of unknown primary. At single-cell resolution, scTransMIL identified both known and novel biomarkers for tumor B cells that conventional differential expression analysis failed to detect, while maintaining concordance with malignant cell annotations. scTransMIL’s adaptability is exemplified in acute myelocytic leukemia, where minimal fine-tuning with only a few patient sample labels enabled: (i) discovery of novel disease subtypes with distinct clinical outcomes, (ii) reconstruction of differentiation trajectories at single-cell resolution, and (iii) identification of subtype-specific gene signatures. In summary, by systematically linking cellular and molecular profiles with clinical disease phenotypes, scTransMIL emerges as a transformative tool poised to advance both basic cancer research and precision oncology applications.
