Transcriptomic Signatures Specific to Thyroid Cancer Subtypes via Computational Clustering
Abstract
Introduction: Thyroid cancer exhibits distinct histopathological and molecular profiles that dictate clinical behavior. Advances in next-generation sequencing have elucidated subtype-specific genomic and transcriptomic alterations, enabling the classification of papillary (PTC), follicular (FTC), medullary (MTC), and anaplastic (ATC) thyroid carcinoma. Despite this progress, a significant gap remains in systematically integrating transcriptomic signatures with clinically actionable outcomes across all subtypes, particularly in resolving intra-tumoral heterogeneity and linking molecular profiles to therapeutic responses.
Objective: To use AI-driven clustering to identify subtype-specific transcriptomic signatures in large-scale datasets such as The Cancer Genome Atlas (TCGA).
Method: Transcriptomic datasets from the TCGA thyroid cancer cohort (PTC, FTC, MTC, ATC) were preprocessed. Single-cell RNA sequencing (scRNA-seq) data were integrated with Seurat, DoubletFinder, and Harmony for single-cell resolution. Unsupervised clustering identified molecular subtypes, and differentially expressed genes (DEGs) were identified with the Wilcoxon rank-sum test under false discovery rate control. Machine learning (ML) models predicted outcomes and were evaluated by 10-fold cross-validation and AUC-ROC. Signatures were related to clinical outcomes using Cox models and Kaplan-Meier analysis, and were validated with GEO datasets, CRISPR, and immunohistochemistry (IHC). Reproducible pipelines were maintained on GitHub for consistency.
Results: The TCGA thyroid cancer cohort (500 samples) was preprocessed (Q30 > 90%, alignment > 85%, DESeq2, ComBat). scRNA-seq integration (25,000 cells) identified 12 cell types, with ATC showing immunosuppressive myeloid cells (p < 0.001). Unsupervised clustering revealed four molecular subtypes and 1,250 DEGs (including BRAF, RET, TP53, and PTEN). ML models (random forest, SVM) achieved AUC-ROC values of 0.92 and 0.89 and identified a 50-gene signature. Clinical integration linked high-risk subtypes to poor survival (HR: 2.5, p < 0.001). Validation (GEO, CRISPR, IHC) confirmed signature robustness (AUC-ROC: 0.89–0.93). Reproducible pipelines were shared via GitHub.
Conclusion: This study identified robust transcriptomic signatures and subtype-specific ecosystems in thyroid cancer, validated through computational clustering, ML, and functional assays. It advances precision oncology by linking molecular profiles to clinical outcomes, supported by reproducible pipelines and high-performance computing.
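
The abstract names a Wilcoxon rank-sum test with false discovery rate control for DEG detection but gives no further detail. As a minimal sketch only, assuming a normal approximation for the rank-sum test (average ranks for ties, no tie or continuity correction) and the Benjamini-Hochberg procedure for FDR, neither of which the abstract confirms, the per-gene step might look like the following. The function names, gene labels, and expression values are hypothetical toy inputs, not study data.

```python
import math
from typing import Dict, List, Sequence


def rank_sum_p(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation.

    Ties receive average ranks; no tie or continuity correction is applied
    (an assumption made to keep the sketch short).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        # Advance j past every value tied with pooled[i].
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - mean) / sd
    # Two-sided tail of the standard normal: 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(pvals: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda k: pvals[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_degs(
    group_a: Dict[str, Sequence[float]],
    group_b: Dict[str, Sequence[float]],
    alpha: float = 0.05,
) -> List[str]:
    """Return genes whose BH-adjusted rank-sum p-value is below alpha."""
    genes = sorted(set(group_a) & set(group_b))
    pvals = [rank_sum_p(group_a[g], group_b[g]) for g in genes]
    qvals = benjamini_hochberg(pvals)
    return [g for g, q in zip(genes, qvals) if q < alpha]


if __name__ == "__main__":
    # Hypothetical toy expression values for two subtypes (not study data).
    subtype_1 = {"GENE_A": [1, 2, 3, 4, 5, 6], "GENE_B": [5, 6, 7, 8, 9, 10]}
    subtype_2 = {"GENE_A": [11, 12, 13, 14, 15, 16], "GENE_B": [5, 6, 7, 8, 9, 10]}
    print(call_degs(subtype_1, subtype_2))
```

In practice an established implementation such as `scipy.stats.mannwhitneyu` or Seurat's `FindMarkers` would be preferable to hand-rolled statistics; the sketch is only meant to show how per-gene p-values feed into a multiple-testing correction before genes are called differentially expressed.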