Robust Prediction of Patient-Specific Cancer Hallmarks Using Neural Multi-Task Learning: a model development and validation study

Abstract

Background Accurate quantification of cancer hallmark activity is essential for understanding tumor progression, tailoring treatments, and improving patient outcomes. Traditional methods, such as histopathological grading and immunohistochemistry for protein expression, often overlook the complex interplay between cancer cells and the tumor microenvironment and provide limited insight into hallmark-specific mechanisms. We aimed to develop OncoMark, a high-throughput deep learning framework based on neural multi-task learning that systematically quantifies hallmark activities using transcriptomics data from routine tumor biopsies.

Methods We acquired single-cell transcriptomics data from 941 tumor samples across 14 tissue types, comprising nearly 3.1 million cells from 56 studies conducted worldwide, to form a large multicenter dataset. Our model uses a supervised neural multi-task learning method designed to predict multiple cancer hallmarks in a biopsy sample simultaneously. The OncoMark model was developed and tested on approximately 90% of the studies (patients from 51 studies) using five-fold cross-validation repeated twice. For further evaluation, the model was assessed on the remaining studies (patients from 5 studies, approximately 10%), which were excluded from the initial training and testing dataset. Additionally, we included patients from publicly available datasets (TCGA, GTEx, ANTE, MET500, POG570, CCLE, TARGET, and PCAWG) to validate the model's performance. The primary objective was to evaluate how well the model identifies cancer hallmarks in cancer datasets, and to confirm that no hallmarks were predicted in normal samples, across four prespecified groups: (i) internal test set, (ii) external test set, (iii) normal samples (real-world), and (iv) cancer samples (real-world).

Findings OncoMark predicted cancer hallmark states with near-perfect accuracy on the internal test data and on the five external test datasets. In internal testing, accuracy, precision, recall, and F1 scores consistently exceeded 99% across hallmarks. External testing confirmed these findings: accuracy, precision, recall, F1 scores, and balanced accuracy consistently exceeded 96.6%, and multiple datasets achieved perfect scores, indicating that the model generalizes to studies it was not trained on. In specificity tests on the GTEx and ANTE datasets, the model accurately classified normal tissues; in sensitivity analyses on the TCGA, MET500, CCLE, TARGET, PCAWG, and POG570 datasets, it identified cancer hallmarks.

Interpretation We developed an AI-based framework that enables accurate, efficient, and cost-effective quantification of cancer hallmark activity directly from transcriptomics data. The framework shows potential as an assistive tool for guiding personalized treatment strategies and advancing the clinical management of cancer patients.
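
The study-level evaluation design described in the Methods (a set of whole studies held out for external testing, with five-fold cross-validation repeated twice on the rest) can be illustrated with a minimal sketch. This is not the authors' code. The function name `study_level_splits`, its parameters, and the toy sample-to-study mapping are hypothetical, and the choice to keep all samples from one study in the same fold is an assumption that the abstract does not state.

```python
import random
from collections import defaultdict


def study_level_splits(sample_to_study, holdout_frac=0.1, n_folds=5,
                       n_repeats=2, seed=0):
    """Hold out whole studies, then build repeated k-fold splits on the rest.

    Assumption (not stated in the abstract): folds are also formed at the
    study level, so no study contributes samples to both train and test.
    """
    rng = random.Random(seed)
    studies = sorted(set(sample_to_study.values()))
    rng.shuffle(studies)

    # With 56 studies and holdout_frac=0.1 this holds out 5 and keeps 51.
    n_holdout = max(1, int(holdout_frac * len(studies)))
    external_studies = studies[:n_holdout]
    development = studies[n_holdout:]

    # Index samples by the study they came from.
    by_study = defaultdict(list)
    for sample, study in sample_to_study.items():
        by_study[study].append(sample)

    folds = []
    for _ in range(n_repeats):
        order = development[:]
        rng.shuffle(order)  # a fresh partition of studies for each repeat
        for k in range(n_folds):
            test_studies = set(order[k::n_folds])
            train = [s for st in order if st not in test_studies
                     for s in by_study[st]]
            test = [s for st in order if st in test_studies
                    for s in by_study[st]]
            folds.append((train, test))

    external = [s for st in external_studies for s in by_study[st]]
    return folds, external


# Toy data: 56 hypothetical studies with 3 samples each.
toy = {f"sample_{i}_{j}": f"study_{i}" for i in range(56) for j in range(3)}
folds, external = study_level_splits(toy)
print(len(folds), "cross-validation splits;", len(external), "external samples")
```

Splitting by study rather than by sample is one way to make the external test set reflect data from centers the model has never seen, which is what the external evaluation in the abstract is meant to measure.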