Single-Cell Transcriptomics and Machine Learning Algorithms Unveil Metastasis-Associated Cellular Subtypes and Prognostic Signatures in Colorectal Cancer

Abstract

Background: Colorectal cancer (CRC) is a prevalent digestive tract malignancy, with liver metastasis occurring in up to 50% of cases. Identifying reliable early markers of metastasis is crucial for improving CRC prognosis.

Methods: We analyzed single-cell RNA sequencing data from CRC patients, including primary tumors, adjacent normal tissues, and liver metastases. Copy number variation (CNV) analysis using the CopyKAT algorithm distinguished tumor cells from non-tumor cells. We identified the key tumor subtypes influencing metastasis through differential gene expression and pathway analyses. Leveraging 103 machine learning algorithms, we developed a metastasis-associated risk model based on the identified biomarkers. The model was validated across multiple external datasets.

Results: We delineated five tumor cell subtypes, with EMP1+ cells emerging as a key subtype in CRC metastasis. The machine learning approach identified a five-gene signature (SPINK1, PLAC8, LAMB3, CEACAM5, CDA) for metastasis risk prediction. The risk model significantly stratified patients into high- and low-risk groups across six independent cohorts, with high risk scores correlating with poorer survival. Gene set enrichment analysis revealed enrichment of epithelial-mesenchymal transition (EMT) pathways in the high-risk group. Mutation analysis showed higher overall mutation frequencies in the high-risk group, particularly in genes such as APC, TP53, and KRAS.

Conclusion: Our single-cell transcriptomics and machine learning approach uncovered novel cellular subtypes and a gene signature associated with CRC metastasis, providing new insights for early diagnosis and potential therapeutic targets.
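As a rough illustration of how a gene-signature risk model can stratify patients, the sketch below scores each patient as a weighted sum of the five signature genes and splits the cohort at the median score. The abstract does not report the model's coefficients, its algorithm, or its cutoff, so the equal weights, the median split, the function names, and the toy expression values are all assumptions made for illustration and are not the authors' method.

```python
from statistics import median

# The five signature genes named in the abstract.
SIGNATURE_GENES = ["SPINK1", "PLAC8", "LAMB3", "CEACAM5", "CDA"]

# HYPOTHETICAL weights: the abstract gives no coefficients, so every gene
# is weighted equally here purely to make the sketch runnable.
PLACEHOLDER_WEIGHTS = {gene: 1.0 for gene in SIGNATURE_GENES}


def risk_score(expression, weights=PLACEHOLDER_WEIGHTS):
    """Return the weighted sum of signature-gene expression for one patient."""
    return sum(weights[gene] * expression[gene] for gene in SIGNATURE_GENES)


def stratify(cohort, weights=PLACEHOLDER_WEIGHTS):
    """Label each patient 'high' or 'low' relative to the cohort median score.

    The median cutoff is an assumption; the abstract does not state one.
    """
    scores = {pid: risk_score(expr, weights) for pid, expr in cohort.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}


# Toy, made-up expression values (not data from the study).
toy_cohort = {
    "patient_a": {gene: 1.0 for gene in SIGNATURE_GENES},
    "patient_b": {gene: 2.0 for gene in SIGNATURE_GENES},
    "patient_c": {gene: 3.0 for gene in SIGNATURE_GENES},
    "patient_d": {gene: 4.0 for gene in SIGNATURE_GENES},
}

groups = stratify(toy_cohort)
```

In practice the weights would come from a fitted survival or classification model, and the resulting groups would be compared with a survival analysis in each validation cohort.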
