Identifying tissue states by spatial protein patterns related to chemotherapy response in triple-negative breast cancer

Curation statements for this article:
  • Curated by eLife

Abstract

Triple-negative breast cancer (TNBC) is an aggressive malignancy with limited targeted therapies and variable responses to conventional chemotherapy, influenced by intratumoral heterogeneity and complex tumor microenvironment (TME) interactions. Understanding spatiotemporal cellular interplay and tissue organization is crucial for advancing tumor biology and improving patient stratification. Spatially resolved proteomics, such as Imaging Mass Cytometry (IMC), offers a powerful approach to dissect the TME. We present an end-to-end computational pipeline for robust quantitative analysis of large-scale IMC datasets, addressing the challenge of batch effects through image-level contrast adjustment. Applying this framework to 813 tissue regions encompassing over 4 million cells from 63 TNBC patients, we revealed distinct spatial arrangements of cell types between chemotherapy responders and non-responders. Non-responders showed reduced cytotoxic T-cell infiltration into tumor regions and increased spatial co-localization between fibroblasts and macrophages, a pattern that persisted and intensified after chemotherapy treatment. To integrate these complex spatial-molecular relationships, we used graph neural networks (GNNs) to predict treatment response from pre-treatment samples with AUROC=0.71. Interpretability analysis identified B7H4, CD11b, CD366, and FOXP3 as the most predictive protein markers, with fibroblasts, cancer cells, and CD8+ T cells being the most informative cell types. This study introduces a scalable analytical framework for spatial proteomics with interpretable predictions, suggesting features of tissue state that could guide treatment decisions in TNBC and further our understanding of the spatial determinants of therapeutic response.

Article activity feed

  1. eLife Assessment

    This is an important work that applies data mining methods to IMC data to discover spatial protein patterns related to chemotherapy response in triple-negative breast cancer patients. The evidence supporting the authors' claims is solid, although more detailed methodological clarification and validation are needed. While the accuracy of the methods is not very high, the work shows potential for translational application.

  2. Reviewer #1 (Public review):

    Summary:

    The study presents a computational pipeline for Imaging Mass Cytometry (IMC) analysis in triple-negative breast cancer (TNBC). Analyzing over 4 million cells from 63 patients, it uncovers a distinct spatial organization of cell types between chemotherapy responders and non-responders. Using graph neural networks, the framework predicts treatment response from pre-treatment samples and identifies key predictive protein markers and cell types associated with therapeutic outcomes.

    Strengths:

    (1) The study presents a novel framework leveraging Imaging Mass Cytometry (IMC) to investigate spatial patterns and differences among patient groups, an approach that has rarely been explored.

    (2) It uncovers several compelling biological insights, providing a deeper understanding of the complex interactions within the tumor microenvironment.

    (3) The analysis pipeline is comprehensive, incorporating batch correction, cell type clustering, and a graph neural network based on cell-cell interactions to predict chemotherapy response, demonstrating methodological innovation and thoughtful design.

    Weaknesses:

    (1) Some figure references are inconsistent. For example, Figure 4C is cited on Page 11, but it does not appear in the manuscript.

    (2) Several explanations and methodological details related to the figures remain unclear. For instance, it is not explained how the overall abundance of cell types in Figures 3D and 3E was calculated, how relative abundance was derived, or how these calculations were adjusted when split by proliferation status. In Table 2, it seems that model performance is reported using different node features (protein abundance or cell type), but the text in the second paragraph suggests that both were used simultaneously. This inconsistency is confusing. Additionally, the process for constructing the cell-cell contact graph, including how edges are defined, should be described more clearly.

    (3) The GNN performance appears modest. An AUROC of 0.71 can indicate meaningful predictive power for chemotherapy response, but it remains moderate. Including a baseline comparison would help contextualize the model's effectiveness. Furthermore, the reported value of 0.58 in Table 2 is relatively low, and its meaning or implication is not clearly explained.

    (4) Some methodological choices are not well justified. For example, the rationale for selecting the Self-Organizing Map (SOM) for clustering over other clustering methods is not discussed.

    (5) The manuscript would benefit from a more explicit discussion of how studies using IMC-based spatial analysis relate to or differ from those employing spatial transcriptomics, particularly in terms of their interpretability.
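
Weakness (2) above asks how edges in the cell-cell contact graph are defined. The manuscript's actual rule is not stated in this review, so the following is only a minimal sketch of one common convention: two cells are connected when their centroids lie within a fixed distance. The function names, the 15-unit threshold, and the toy coordinates are hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import dist


def build_contact_graph(centroids, max_distance):
    """Return undirected edges (i, j) with i < j for every pair of cells
    whose centroids lie within max_distance of each other.

    Assumption: a distance threshold on centroids stands in for "contact".
    The reviewed manuscript may define edges differently.
    """
    edges = []
    for (i, p), (j, q) in combinations(enumerate(centroids), 2):
        if dist(p, q) <= max_distance:
            edges.append((i, j))
    return edges


def count_type_pairs(edges, cell_types):
    """Count edges per unordered pair of cell-type labels."""
    counts = {}
    for i, j in edges:
        pair = tuple(sorted((cell_types[i], cell_types[j])))
        counts[pair] = counts.get(pair, 0) + 1
    return counts


# Toy data (hypothetical): four cells forming two well-separated pairs.
centroids = [(0.0, 0.0), (5.0, 0.0), (50.0, 50.0), (53.0, 54.0)]
cell_types = ["fibroblast", "macrophage", "cancer", "CD8 T cell"]

edges = build_contact_graph(centroids, max_distance=15.0)
pair_counts = count_type_pairs(edges, cell_types)
print(edges)
print(pair_counts)
```

The brute-force pairwise loop is quadratic in the number of cells, which is fine for a toy example but not for millions of cells; a spatial index would be the usual replacement at that scale. Stating the threshold and the distance definition explicitly is what makes the resulting graph reproducible.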

  3. Reviewer #2 (Public review):

    Summary:

    The current research presents an end-to-end computational workflow for large-scale Imaging Mass Cytometry (IMC) data and applies it to 813 regions of interest (ROIs) comprising over 4 million cells from 63 TNBC patients. The study integrates image preprocessing (IMC-Denoise and CLAHE), cell segmentation (Mesmer), phenotyping (Pixie), spatial neighborhood analysis (SquidPy), collagen feature extraction, and graph neural network (GNN) modeling to identify spatial-molecular determinants of chemotherapy response. The major observations include T-cell exclusion in non-responders, persistent fibroblast-macrophage co-localization post-therapy, and the identification of B7H4, CD11b, CD366, and FOXP3 as predictive markers via GNN explainability analysis. The work has been implemented on a rich dataset and integrated with spatial and molecular information. The manuscript is well written and addresses an important clinical question.

    Strengths:

    (1) The study analyzes 813 ROIs and over 4 million cells, which is an exceptionally large IMC dataset, and allows the authors to investigate spatial determinants of chemotherapy response in TNBC with considerably more statistical power than prior studies. It clearly shows an integrated spatial-proteomic analysis on a large IMC dataset.

    (2) The work reveals robust, conceptually meaningful tissue patterns, namely CD8+ T-cell exclusion from tumor regions in non-responders and increased fibroblast-macrophage spatial proximity, that align with existing biological understanding of immunosuppressive microenvironments in TNBC. These findings highlight spatial organization, rather than simple cell abundance, as a key differentiator of treatment response.

    (3) The novel use of GNNs for chemoresponse prediction in IMC data helps demonstrate that spatial and molecular features captured simultaneously can provide predictive information about treatment response. The use of GNNExplainer adds interpretability to the selected features, identifying immune-regulatory markers such as B7H4, CD366, FOXP3, and CD11b as contributors to chemoresponse heterogeneity.

    (4) The work complements emerging spatial transcriptomic analyses from the same SMART cohort and provides a scalable computational framework likely to be useful to other IMC and spatial-omics researchers.

    Weaknesses:

    (1) Some analytical components lack quantitative validation, limiting confidence in specific claims. For example, the CLAHE-based batch correction applied before segmentation is evaluated primarily through qualitative visualization rather than quantitative metrics. Similarly, the cell-type annotations produced via Pixie and manual thresholds lack independent validation, making it harder to assess the accuracy of downstream spatial and predictive analyses.

    (2) Predictive modeling performance is moderate and may be influenced by dataset structure; the GNN achieves AUROC ~0.71, which is meaningful but still limited, and the absence of external validation or multiple cross-validation strategies raises questions about generalizability. The predictive insights are promising but not yet sufficiently strong to support clinical decision-making.

    (3) Pre- and post-treatment comparisons are constrained to non-responders and pathologist-selected ROIs.
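
Weakness (2) above raises dataset structure and cross-validation strategy. Because the 813 ROIs come from only 63 patients, one way to probe this concern is to split by patient rather than by ROI, so that no patient contributes ROIs to both the training and test sets, and to report AUROC next to a constant-score baseline. The sketch below illustrates that idea only; it is not the authors' evaluation protocol, and the function names, patient IDs, labels, and scores are hypothetical.

```python
def grouped_folds(patient_ids, n_folds):
    """Assign whole patients (not individual ROIs) to folds.

    Returns a list of n_folds lists, each holding the ROI indices that
    form the test set of that fold.
    """
    patients = sorted(set(patient_ids))
    fold_of = {p: k % n_folds for k, p in enumerate(patients)}
    folds = [[] for _ in range(n_folds)]
    for idx, p in enumerate(patient_ids):
        folds[fold_of[p]].append(idx)
    return folds


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): seven ROIs drawn from four patients.
roi_patients = ["P1", "P1", "P2", "P2", "P3", "P3", "P4"]
folds = grouped_folds(roi_patients, n_folds=2)

# Toy patient-level labels (1 = responder) and model scores.
labels = [1, 0, 1, 0]
model_scores = [0.9, 0.4, 0.35, 0.2]
model_auroc = auroc(labels, model_scores)
chance_auroc = auroc(labels, [0.5] * len(labels))  # constant-score baseline
print(folds, model_auroc, chance_auroc)
```

Splitting at the patient level guards against a model recognizing a patient rather than a tissue state, and the constant-score baseline makes explicit what an uninformative predictor scores on the same labels.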

  4. Reviewer #3 (Public review):

    Summary:

    Luque et al. propose stratifying chemotherapy response in triple-negative breast cancer based on spatial protein patterns from IMC data. The proposed method combines a GNN with GNNExplainer to identify several important protein markers and cell types related to chemotherapy response. Because it addresses one of the most significant challenges in cancer research, this work holds great potential for translational medicine.

    Strengths:

    (1) The work targets intervention decision-making in TNBC, one of the prominent challenges in the field.

    (2) Cutting-edge spatial proteomics data with a sufficiently large cohort and clinical outcome information.

    (3) Appropriate usage of cutting-edge machine learning models and comprehensive analysis.

    Weaknesses:

    (1) More scientific rigor is needed for machine learning benchmarking.

    (2) More depth is needed in comparing this work with related studies that use similar approaches.