CNN-based learning of single-cell transcriptomes reveals a blood-detectable multi-cancer signature of brain metastasis
Curation statements for this article:-
Curated by eLife
eLife Assessment
This important study describes a deep learning framework that analyzes single-cell RNA data to identify a tumor-agnostic gene signature associated with brain metastases. The identified signature uncovers key molecular mechanisms, highlights potential therapeutic targets, and demonstrates a metastasis-specific transcriptional signal in circulating platelets, suggesting its promise for non-invasive diagnostics through liquid biopsy. The evidence supporting the findings is solid, utilizing interpretable deep learning methodologies and large-scale datasets across multiple cancer types, though some aspects may benefit from additional analysis and validation.
Abstract
Brain metastasis (BrM) is a serious complication of advanced cancers and remains difficult to predict before clinical symptoms appear. To investigate shared transcriptional features of BrM across tumour types, we integrated single-cell RNA sequencing (scRNA-seq) data from malignant epithelial cells derived from six carcinoma types, including lung, breast, colorectal, renal, prostate, and melanoma. We applied ScaiVision, a supervised representation learning method, to classify tumour samples based on BrM status. The models achieved high predictive accuracy (area under the ROC curve > 0.90) across all six cancer types. This analysis identified a consistent multi-cancer gene expression signature associated with BrM, defined at single-cell resolution. To evaluate the clinical relevance of this signature, we assessed its presence in tumour-educated platelets (TEPs) from blood samples of patients with and without BrM. The signature was detectable in platelet RNA and distinguished patients with BrM from those without, indicating that features of the BrM-associated expression program are reflected in blood-derived material. These findings demonstrate that a transcriptional signature of brain metastasis can be identified across multiple tumour types using scRNA-seq and neural network-based analysis. The detectability of this signature in TEPs supports its relevance in a non-invasive context and provides a basis for further investigation into its utility for BrM risk assessment.
Article activity feed
-
Reviewer #1 (Public review):
Summary:
This paper applies ScaiVision, a convolutional neural network (CNN)-based supervised representation learning method, to single-cell RNA sequencing (scRNA-seq) data from six carcinoma types. The goal is to identify a pan-cancer gene expression signature of brain metastasis (BrM) that is both interpretable and clinically useful. The authors report:
(1) High classification accuracy for distinguishing primary tumours from brain metastases (AUC > 0.9 in training, > 0.8 in validation).
(2) Discovery of a 173-gene BrM signature, with a robust top-20 core.
(3) Evidence that the BrM signature is detectable in tumour-educated platelets (TEPs), enabling a potential non-invasive biomarker.
(4) Mechanistic analyses implicating VEGF-VEGFR1 signaling and ETS1 as central drivers of BrM.
(5) A computational drug repurposing screen highlighting pazopanib as a candidate therapeutic.
Strengths:
(1) Biological scope:
Integration of six tumour types highlights shared mechanisms of brain metastasis, beyond tumour-specific studies.
(2) Interpretability:
Use of integrated gradients on ScaiVision models identifies genes that drive classification, linking predictions to interpretable biology.
(3) Multi-modal validation:
BrM signature validated across scRNA-seq, spatial transcriptomics, pseudotime analyses, and liquid biopsy data.
(4) Translational potential:
Detection in TEPs provides a promising path toward a blood-based biomarker.
(5) Therapeutic angle:
Drug repurposing analysis identifies VEGF-targeting compounds, with pazopanib highlighted.
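The interpretability point above rests on integrated gradients. As a minimal sketch of that technique only, and not of ScaiVision itself (which is proprietary), the example below attributes the output of a toy logistic scorer to its input features. The weights, bias, all-zero baseline, and the three-feature "cell" are hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, bias):
    # Toy stand-in for a classifier: logistic score over input features.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + bias)

def grad(x, w, bias):
    # Analytic gradient of the logistic score with respect to each input.
    p = model(x, w, bias)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, bias, steps=200):
    # Midpoint Riemann sum of the gradient along the straight path
    # from the baseline to x, scaled by (x - baseline) per feature.
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        total = [t + g for t, g in zip(total, grad(point, w, bias))]
    return [(xi - b0) * t / steps for xi, b0, t in zip(x, baseline, total)]

# Hypothetical values: three "genes", one of which the model ignores.
weights = [1.5, -0.5, 0.0]
bias = -0.2
cell = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(cell, baseline, weights, bias)
score_gap = model(cell, weights, bias) - model(baseline, weights, bias)
```

A useful sanity check is the completeness property: the attributions should sum to approximately the difference between the model output at the input and at the baseline (`score_gap` here). Ranking genes by attribution is what links a prediction back to candidate signature genes.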
Weaknesses:
(1) Methodological contribution is limited:
ScaiVision is an existing proprietary framework; the paper does not introduce a new method.
No baseline comparisons (e.g., logistic regression, random forest, scVI, simple MLP) are presented, so the added value of CNNs over simpler models is unclear.
(2) Data constraints:
The dataset size is modest (115 samples, of which 21 are BrM), although each sample contributes thousands of cells.
Training relies on patient-level labels, with subsampling to generate examples - a multi-instance learning setup that could be benchmarked more explicitly.
(3) Validation gaps:
Biomarker detection in platelets is based on retrospective bulk RNA-seq; no prospective patient validation is included.
Mechanistic claims (ETS1, VEGF) are computational inferences without wet-lab validation.
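The multi-instance setup noted under data constraints (patient-level labels, with training examples generated by subsampling cells) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the function names, bag sizes, and toy data are hypothetical. The point it shows is that the train/test split should be made at the sample level, so that bags drawn from one patient never appear on both sides.

```python
import random

def make_bags(samples, bags_per_sample, cells_per_bag, seed=0):
    # samples: {sample_id: (list_of_cells, label)}.
    # Each bag is a random subset of one sample's cells and inherits
    # that sample's label (the multi-instance assumption).
    rng = random.Random(seed)
    bags = []
    for sample_id, (cells, label) in samples.items():
        for _ in range(bags_per_sample):
            bags.append((sample_id, rng.sample(cells, cells_per_bag), label))
    return bags

def split_by_sample(sample_ids, n_test, seed=0):
    # Split on sample IDs, not on bags, to avoid patient-level leakage.
    rng = random.Random(seed)
    ids = sorted(sample_ids)
    rng.shuffle(ids)
    return set(ids[n_test:]), set(ids[:n_test])

# Hypothetical toy data: 6 patients, 50 two-feature cells each, 2 with BrM.
samples = {
    f"patient_{i}": ([[float(i), float(j)] for j in range(50)], 1 if i < 2 else 0)
    for i in range(6)
}

bags = make_bags(samples, bags_per_sample=4, cells_per_bag=10)
train_ids, test_ids = split_by_sample(samples.keys(), n_test=2)
train_bags = [b for b in bags if b[0] in train_ids]
test_bags = [b for b in bags if b[0] in test_ids]
```

Benchmarking simpler baselines on the same bags and the same sample-level split would make the comparison with a CNN like-for-like.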
-
Reviewer #2 (Public review):
Summary:
This important study describes a deep learning framework that analyzes single-cell RNA data to identify a tumor-agnostic gene signature associated with brain metastases. The identified signature uncovers key molecular mechanisms, such as VEGF signaling, and highlights potential therapeutic targets. The authors also assess the performance of the gene signature in liquid biopsy and show that the brain metastasis signature yields a robust, metastasis-specific transcriptional signal in circulating platelets, suggesting potential for non-invasive diagnostics.
Strengths:
(1) The approach is multi-cancer, identifying mechanisms shared across diseases beyond tumor-specific constraints.
(2) The deep learning workflow is robust and explainable, uses scRNA-seq data from various cancer types, and demonstrates solid predictive accuracy.
(3) The detection of the BrM signature in tumor-educated platelets (TEPs) indicates a promising avenue for developing liquid biopsy assays, which could significantly enhance early detection capabilities.
Weaknesses:
(1) The paper lacks a thorough comparison with other reported signatures in the literature, which could help contextualize the performance and uniqueness of the authors' findings.
(2) The model training focused solely on epithelial cells, potentially overlooking critical contributions from stromal and immune cell types, which could provide a more comprehensive understanding of the tumor microenvironment.
(3) While the results are promising, there is a need for validation across tumor types not included in the training set to assess the generalizability of the signature.
Achievements:
The authors have made significant progress toward their aims, successfully identifying a transcriptional signature that is associated with brain metastasis across multiple cancer types. The results support their conclusions, showcasing the BrM signature's ability to distinguish between metastatic and primary tumor cells and its potential usability as a non-invasive biomarker.
This study has the potential to make a substantial impact in oncological research and clinical practice, particularly in the management of patients at risk for brain metastasis. The identification of a gene signature applicable across various tumor types could lead to the development of standardized diagnostic tools for early detection. Moreover, the emphasis on non-invasive diagnostic techniques aligns well with the current trends in precision medicine, making the findings highly relevant for the broader medical community.
-
Reviewer #3 (Public review):
Summary:
The article develops a CNN-based metastasis scoring system to distinguish cell subsets with high brain metastatic potential and validates its performance using patient platelet data. The robustness of this approach is further demonstrated across diverse single-cell and spatial datasets from multiple cancers, supported by transcription factor and gene set analyses, as well as novel drug identification pipelines. Together, these findings provide strong evidence that reinforces the central theme of the study.
Strengths:
The authors develop a CNN-based scoring system for brain metastatic potential that is robust across multiple cancer cell types and validated on multiple datasets. Complementary approaches, including transcription factor analysis, cell-cell communication analysis, and spatial transcriptomics, strengthen the work.
Weaknesses:
The authors could identify and validate signaling pathways beyond VEGF, whose role in metastasis is already well known.
-
Reviewer #4 (Public review):
Summary:
This work provides a gene signature for brain metastases derived from an integrated single-cell dataset of six carcinomas. A key rationale for the approach is the notion that metastases originating from different organs may converge upon a similar set of transcriptional states, representing shared functional and developmental programs. By combining primary tumor and brain metastasis samples, the authors apply an interpretable deep-learning approach to a multi-cancer single-cell dataset to derive a signature that predicts brain metastasis from a primary tumor and is more robust and generalizable than a signature derived from an individual cancer type. They employ a variety of single-cell tools to identify a putative mechanism of action for metastatic progression to the brain involving VEGF-related signaling, and find some evidence supporting this hypothesis in spatial data. A drug repurposing analysis is performed to identify a potential therapeutic candidate for VEGF-driven brain metastasis, and the authors demonstrate an intriguing possibility for using their brain metastasis signature in a blood-based test in the clinic.
Strengths:
An interpretable deep-learning approach allows both for high-accuracy classification of brain metastases from primary tumors and the identification of a gene signature. Much work goes into validating the gene signature in different contexts and modalities, and the results present a cohesive picture of metastasis progression. The analysis highlighting certain cells within the primary tumor that may be more likely to metastasize is interesting, and the demonstrated difference in mean signature expression in bulk RNA-seq of tumor-educated platelets (TEPs) has strong implications for the clinic.
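To make the TEP comparison concrete, the sketch below scores each bulk sample by the mean expression of a set of signature genes and summarizes group separation with a rank-based AUC (the fraction of BrM/non-BrM pairs in which the BrM sample scores higher, with ties counted as one half). This is an illustration of the general idea rather than the authors' analysis: the gene names, expression values, and sample groups are hypothetical.

```python
def signature_score(expression, signature_genes):
    # Mean expression over the signature genes present in the sample.
    values = [expression[g] for g in signature_genes if g in expression]
    return sum(values) / len(values)

def auc(pos_scores, neg_scores):
    # Rank-based AUC: probability that a positive outranks a negative,
    # counting ties as 0.5 (equivalent to the Mann-Whitney U statistic
    # divided by the number of pairs).
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical signature and toy expression profiles.
signature = ["GENE_A", "GENE_B"]
brm_samples = [
    {"GENE_A": 5.0, "GENE_B": 4.0, "GENE_C": 1.0},
    {"GENE_A": 3.0, "GENE_B": 3.0, "GENE_C": 2.0},
]
non_brm_samples = [
    {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 3.0},
    {"GENE_A": 3.0, "GENE_B": 3.0, "GENE_C": 0.0},
    {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 4.0},
]

brm_scores = [signature_score(s, signature) for s in brm_samples]
non_brm_scores = [signature_score(s, signature) for s in non_brm_samples]
separation = auc(brm_scores, non_brm_scores)
```

A difference in group means and an AUC answer different questions: the former describes an average shift, the latter how well individual patients are separated, which is the quantity that matters for a diagnostic test.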
Weaknesses:
The authors derive the signature from cancerous epithelial cells, citing a desire to avoid bias from differences in cellular composition. Yet much of the downstream analysis is performed across different cancer types and different cell types. Differential analysis is then performed between the highest-scoring and lowest-scoring cells, apparently without any adjustment for cell type composition at this stage, which could bias the results. Given that the signature was initially identified in epithelial cells, applying it to immune and stromal compartments seems a leap. Perhaps the proof is in the pudding, yet it raises the question of what would have happened had the authors not restricted the initial step of signature generation to epithelial cells.
In addition, although a cohesive story around VEGF is presented, VEGF was merely one of several upregulated signaling pathways. There were quite a few others (ANGPT, CDH1, CADM, IGF) that the authors do not address. VEGF is, of course, very well studied, and while the authors do distinguish their signature from VEGF in the context of TEPs, it leaves open the question of whether one of the other highlighted genes may be equally powerful and, because fewer genes are involved, more feasible to bring into the clinic.
The cell-cell communication analysis seems somewhat weak, although it uses a standard set of tools. Most of the analysis was based on single-cell data, without spatial context, and the authors highlighted epithelial cells as the senders for the VEGF pathway. Yet in the Visium data, expression of the signature seems highest in non-tumor cells, and the strongest interactions seem to be quite spatially separated (Figures 5C and 5E).