Predicting progression-free survival after systemic therapy in advanced head and neck cancer: Bayesian regression and model development

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    Barber et al. present a manuscript discussing predictive factors for chemotherapy efficacy in head and neck squamous cancer (HNSCC). The paper is well written, and its style/formatting are optimal. The baseline signature moderately predicted outcome, and the data after one cycle further improved the algorithm, though this decreases its utility as a pure predictive tool. It is interesting that a subpopulation of monocytes, a subset of white peripheral cells long suspected to correlate with outcomes in HNSCC, was one of the key drivers of the algorithm. However, the overall impact in the field of this work seems limited by a number of factors, including that the authors focused on immune cell subpopulations and exosomes, which narrows the scope (no cytokines or other biomarkers were included); the signatures were not prospectively validated on an independent cohort; the algorithm was developed around a first-line therapy that is no longer considered to be the standard of care for HNSCC; and, while most of the conclusions are supported by the data, some of the caveats (such as the lack of a validation cohort, key in predictive biomarker development), are not addressed.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

Abstract

Advanced head and neck squamous cell carcinoma (HNSCC) is associated with a poor prognosis, and biomarkers that predict response to treatment are highly desirable. The primary aim was to predict progression-free survival (PFS) with a multivariate risk prediction model.

Methods:

Experimental covariates were derived from blood samples of 56 HNSCC patients which were prospectively obtained within a Phase 2 clinical trial (NCT02633800) at baseline and after the first treatment cycle of combined platinum-based chemotherapy with cetuximab treatment. Clinical and experimental covariates were selected by Bayesian multivariate regression to form risk scores to predict PFS.

Results:

A ‘baseline’ and a ‘combined’ risk prediction model were generated, each featuring clinical and experimental covariates. The baseline risk signature has three covariates and was strongly driven by the baseline percentage of CD33+CD14+HLADRhigh monocytes. The combined signature has six covariates, also featuring baseline CD33+CD14+HLADRhigh monocytes, but is strongly driven by the on-treatment relative change in the percentage of CD8+ central memory T cells. The combined model has a higher predictive power than the baseline model and was successfully validated to predict therapeutic response in an independent cohort of nine patients from an additional Phase 2 trial (NCT03494322) assessing the addition of avelumab to cetuximab treatment in HNSCC. We identified tissue counterparts for the immune cells driving the models, using imaging mass cytometry, that specifically colocalized at the tissue level and correlated with outcome.

Conclusions:

This immune-based combined multimodality signature, obtained through longitudinal peripheral blood monitoring and validated in an independent cohort, presents a novel means of predicting response early on during the treatment course.

Funding:

Daiichi Sankyo Inc, Cancer Research UK, EU IMI2 IMMUCAN, UK Medical Research Council, European Research Council (335326), Merck Serono, Cancer Research Institute, National Institute for Health Research, Guy’s and St Thomas’ NHS Foundation Trust and The Institute of Cancer Research.

Clinical trial number:

NCT02633800.

Article activity feed

  1. Author Response

    Evaluation Summary:

    1. The paper is well written, and its style/formatting are optimal. The baseline signature moderately predicted outcome, and the data after one cycle further improved the algorithm, though this decreases its utility as a pure predictive tool

    We thank the editor and the reviewers for their positive feedback regarding the style and formatting of the manuscript. We concur that longitudinal sampling of blood, before and after one cycle of treatment, renders the predictive signature marginally more laborious to generate. In an ideal setting, we would be able to solely generate a predictive signature based on baseline characteristics - unfortunately such a test does not yet exist.

    In this study, we propose adding an easily obtainable blood sample after the first cycle of treatment to significantly improve our ability to predict response. Because blood is easy to sample, we believe that blood biopsies will be key as the search for predictive biomarkers expands. Since the inception of our study, numerous impactful studies assessing PBMCs have been published, mainly in the context of response to immune checkpoint blockade 1-6. Given that our risk signature is now validated in an immunotherapy trial (EACH trial, NCT03494322), we are even more confident in our unique approach of using longitudinal sampling to develop a model that predicts response to systemic therapy. The trial design of the validation study is now included in the manuscript as supplementary material (Figure 2A).

    2. Signatures were not prospectively validated on an independent cohort; the algorithm was developed around a first-line therapy that is no longer considered to be the standard of care for HNSCC; and, while most of the conclusions are supported by the data, some of the caveats (such as the lack of a validation cohort, key in predictive biomarker development), are not addressed.

    Thank you. We will address this comment in two parts: (a) the validation cohort and (b) the status of the EXTREME treatment regimen in the original cohort. In this revised version, we have validated our risk signature in an independent cohort of patients who received cetuximab and avelumab (anti-PD-L1) in a single-arm, phase 2 clinical trial setting. Beyond serving purely as a validation cohort, it also demonstrates the applicability of our model in predicting response to immune checkpoint blockade-based therapy, in keeping with contemporary advances in systemic treatment for HNSCC. The risk signature strongly predicted response in the new independent cohort, giving us more confidence in our model’s ability to predict outcome for systemic therapy regimens beyond cytotoxic chemotherapy and cetuximab. Figure 5B shows the strong correlation between the risk signature and disease outcome in the validation cohort (Kendall rank correlation, τ = 0.725, p = 0.0181).
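For readers unfamiliar with the statistic quoted for Figure 5B, the following is a minimal sketch of how a Kendall rank correlation between a risk score and an ordinal response could be computed. It assumes the tie-corrected tau-b variant and a numeric coding of response; the response does not state which variant or coding the authors used, and the function name is illustrative only.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: a rank correlation that corrects for tied values.

    Assumption: the tau-b variant is used here for illustration; the source
    only says "Kendall rank correlation".
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: contributes to neither term
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1  # the pair is ordered the same way by x and y
        else:
            discordant += 1  # the pair is ordered in opposite ways
    untied = concordant + discordant
    denom = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denom if denom else float("nan")
```

With an ordinal outcome such as response categories, many pairs are tied on the outcome, which is why a tie-corrected variant is a natural choice for a small cohort.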

    Secondly, the EXTREME regimen (platinum/5-FU/cetuximab) remains a first-line standard of care treatment in the UK and European countries for HNSCC patients with negative PD-L1 status (CPS <1), who account for around 15% of all HNSCC patients 7. While the US Food and Drug Administration (FDA) approved pembrolizumab in combination with chemotherapy as first-line treatment regardless of PD-L1 expression and pembrolizumab alone for patients with PD-L1-expressing tumours (CPS ≥1), the European Medicines Agency (EMA) approved pembrolizumab with or without chemotherapy only for patients with a CPS ≥1, and this has been highlighted in the European Society for Medical Oncology (ESMO) and the UK National Institute for Health and Care Excellence (NICE) guidelines 8 and (https://www.nice.org.uk/guidance/ta661/chapter/1-Recommendations).

    Furthermore, chemotherapy with the EXTREME regimen is the standard of care for patients with contraindications to immune checkpoint inhibitors, such as autoimmune disease 8. It can also be considered as second-line treatment in patients who received only pembrolizumab monotherapy in the first-line setting.

    3. However, the overall impact in the field of this work seems limited by a number of factors, including that the authors focused on immune cell subpopulations and exosomes, which narrows the scope (no cytokines or other biomarkers were included).

    Thank you. We selected a finite number of covariates based on three factors: (a) published literature, (b) previous data generated by the group and (c) the applicability of the findings to the clinic. Instead of writing an exploratory article, in which we could have generated a very large number of covariates with a technique such as RNA sequencing, we opted for a select set of covariates. This hypothesis-driven approach generated a strong signature that is now validated across two trials. The focus on immune populations is driven by our hypothesis that systemic changes in the PBMCs reflect the status of the intra-tumoral immune response. In the revised manuscript, we used a custom immune-focused imaging mass cytometry antibody panel to probe tissue sections from 9 patients. We now show that the key populations driving the predictive model in the periphery are not only reflected at the tumoral level, but that these disparate immune cell subpopulations also interact. See Figure 6, in which we use a machine learning approach to segment cells and assign them to distinct immunological subpopulations. We found that the peripheral monocyte population strongly correlated with a tumoral macrophage population that has a similar marker expression pattern, and that the peripheral central memory CD8 T cells inversely correlated with tissue-resident memory T cells. The tissue presence of both these cell populations correlated positively with outcome. Most strikingly, these two populations were the most likely to co-localize with each other at the tissue level, at a frequency almost double that of the second most frequent co-localization. Data on the interplay between peripheral systemic immunity and intra-tumoral immunity are novel and rarely exist in the literature outside the scope of in vivo animal models. Here we describe these interactions using samples from human patients treated with a clinically relevant therapy.

    Given the limited amount of patient sera collected in the trial, we opted to perform exosome analysis on markers known to impact the response to the anti-EGFR/HER3 treatment/immune responses. This was in line with our lab’s work using exosome FRET-FLIM as a surrogate for tissue FRET-FLIM, which we originally used to discover a potential dimer-dependent mechanism for anti-EGFR treatment resistance in neoadjuvant breast cancer patients 9, and which we more recently applied to a colorectal patient sample cohort from the COIN study 10. While the exosome EGFR-HER3 heterodimer failed to reach significance in our risk signature, it came close, as depicted in the Kaplan-Meier curve in Figure 3C. We of course acknowledge the potential added benefit of serum cytokine array analysis. While that was not feasible for this study, our group now aims to ensure that extra patient serum samples are bio-banked for such analysis in ongoing and future trials.

    Reviewer 1 (Public Review):

    1. For this study to be significant, one would want to see a marked improvement over current biomarkers, in a robust and generalizable population. Unfortunately, this study falls short in these respects. First, the authors do not adequately discuss the prior literature. Even a fairly crude and old-fashioned blood-based biomarker such as neutrophil:lymphocyte ratio has quite good predictive and prognostic capability in R/M HNSCC

    Thank you for your suggestion. We have expanded the discussion to include an overview of current biomarkers. We also compared the predictive power of the neutrophil:lymphocyte ratio (NLR), as reported in two published meta-analyses 11,12, with that of our risk signature. We used the median risk score to divide our original patient cohort into a high-risk and a low-risk group. We then calculated the hazard ratios (HRs) and confidence intervals (CIs) for the pre-treatment signature alone (HR = 4.1397 [95% CI: 1.975 - 8.676]) and for the combined signature (HR = 2.574 [95% CI: 1.336 - 4.96]). Both were higher than the values in the published literature, even though we used only the median as the cutoff. Mascarella, Mannard et al. published “NLR greater than the cutoff value was associated with poorer OS and DSS (HR 1.69; 95% CI 1.47-1.93; P < .001 and HR 1.88; 95% CI 1.20-2.95”, and Takenaka, Oya et al. published: “The combined hazard ratio for OS in patients with an elevated NLR (range 2.04-5) was 1.78 (confidence interval [CI] 1.53-2.07”. We realize that we are stratifying patients based on PFS and not overall survival, which is an inherent limitation of the study, but we humbly believe that the added predictive value of the signature relative to the existing literature is too large not to be impactful.
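The median split and the hazard ratios quoted above can be illustrated with a short sketch. It makes two assumptions that the response does not confirm: that scores strictly above the median are labelled high risk, and that the reported intervals are Wald-type intervals, symmetric on the log scale. Under the second assumption, the geometric mean of the two bounds should recover the point estimate, which gives a quick internal consistency check on the reported numbers. The function names are hypothetical.

```python
from math import exp, log
from statistics import median


def split_by_median(risk_scores):
    """Label each patient 'high' or 'low' risk relative to the cohort median.

    Assumption: scores equal to the median are labelled 'low'.
    """
    cutoff = median(risk_scores)
    return ["high" if score > cutoff else "low" for score in risk_scores]


def log_scale_summary(ci_low, ci_high, z=1.96):
    """Recover the standard error of log(HR) and the implied point estimate.

    Assumption: the interval is exp(log(HR) +/- z * SE), i.e. a Wald interval.
    """
    standard_error = (log(ci_high) - log(ci_low)) / (2 * z)
    implied_hr = exp((log(ci_low) + log(ci_high)) / 2)
    return standard_error, implied_hr


# Hazard ratios and 95% CIs as quoted in the author response.
reported = {
    "baseline": (4.1397, 1.975, 8.676),
    "combined": (2.574, 1.336, 4.96),
}
for name, (hr, low, high) in reported.items():
    se, implied = log_scale_summary(low, high)
    print(f"{name}: reported HR={hr}, implied HR={implied:.4f}, SE(log HR)={se:.3f}")
```

Both reported point estimates agree with the values implied by their intervals to about three decimal places, which is consistent with intervals computed on the log scale.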

    2. It is not clear to me that there is a compelling need to do better -- given that existing predictive biomarkers based on clinical nomograms or NLR are actually used in practice.

    We agree that clinical nomograms (based on clinicopathological factors) have been shown to be predictors of outcomes in HNSCC 13. However, whilst these models have been validated as prognostic biomarkers for overall survival and/or disease-specific survival, they are not currently recommended in the cancer treatment guidelines nor universally used in the clinic. With the further validation performed on a cohort treated with an immune checkpoint inhibitor, our multimodal signature provides new data to help understand the range of treatment responses and predict outcomes, and could be used to guide treatment intensification, continuation and/or early termination in clinical practice, or be incorporated into future clinical trials. Moreover, in the resubmission we extend our work from predictive biomarker research to developing a better understanding of the interplay between the peripheral immune response and intra-tumoral immunity, which we discuss in this letter as part of our response to part 3 of the public evaluation summary. Given the recent surge in literature focused on tumor immunity with the increased use of immune checkpoint blockers, we believe our work offers a strong contribution to the few papers in circulation that have attempted to link tumor immunity from the systemic level to the tumor tissue level.

    3. A large number (31 of 87) patients were not included due to lack of biomaterials. No analyses have been performed to examine the characteristics of these patients. It is unlikely that the collection of biomaterials has no correlation with disease characteristics, prognostic features, outcomes, or the analytes in this study. This exclusion -- akin to unequal censoring in clinical trials -- is likely to significantly impact results. Given that the population enrolled in a phase II trial, and that sub-population of patients who survive long enough and are feeling well enough to submit to large volume blood draws on trial, would not necessarily represent the real world population of R/M HNSCC patients, a broader population is needed to justify conclusions about this assay having robust predictive value.

    We appreciate the reviewer’s concern about potential skew in the data arising from the patient selection criteria. The median PFS of the 56-patient cohort used to generate the risk signature was 5.48 months, as shown in supplementary table 1 of the original submission. This is in line with real-world treatment outcomes for the EXTREME regimen (cetuximab with platinum-based therapy) as first-line therapy for recurrent/metastatic squamous cell carcinoma of the head and neck, for which Sano et al. reported a median PFS of 5 months in 2019 14. It is also very similar to the median PFS observed in the DIRECT study 15.

    4. It is unclear why OS as a hard endpoint was not analyzed here. No explanation is provided, other than OS was not available, a statement that is difficult to understand, given that PFS was available, and overall survival is a component of PFS.

    Thank you. We admit that the absence of overall survival is an inherent limitation of the study. In the process of submitting this revision, we once again requested this dataset from the sponsoring pharmaceutical company but were informed that they are unable to provide it, because a reorganization of funding priorities within the company precludes them from opening datasets from an already-published clinical trial. We are equally disappointed not to be able to obtain these data, but firmly believe that the ability of the signature to predict PFS (the primary endpoint of the trial, untainted by subsequent lines of treatment), as well as its validation against the contemporary EACH trial, is a testament to the signature’s strength.

    5. There is no validation set for the biomarker. The biomarker was trained and cross-validated using Bayesian techniques to reduce overfitting. This is a valid approach for training and cross-validation, but for the biomarker to be testable and interpretable, it requires assessment in an independent dataset. There is no statistical technique that I am aware of that generates informative biomarkers without an independent validation dataset.

    We completely agree with the reviewer regarding the need to obtain a validation set. Obtaining patient samples from a similar cohort was difficult, but we managed to validate the signature on a set of patients treated with an anti-PD-L1 monoclonal antibody in combination with cetuximab. Furthermore, the validation was performed using the limited number of covariates that were identified in the risk signature by the Bayesian model. These immune populations can be measured by running a limited set of markers on flow cytometry. We were very happy to see that these limited immune-based covariates strongly correlated with a worse disease response in an independent cohort treated with a different treatment modality. This supports our hypothesis that changes in the immune populations are key to understanding response to systemic therapy. Encouraged by the data from the validation cohort, we extended our analysis to tissue from a total of 9 patients from the test cohort. Using imaging mass cytometry, we were able to identify how immune populations are mirrored at the tumoral level, opening the horizon for new research. The data for the validation set are copied into this letter in response to point 2 of the public evaluation summary.
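The idea of applying a fixed signature to a new cohort can be sketched as follows. This is an illustration only: it assumes the risk score is a weighted sum of the selected covariates, with weights frozen from the training cohort, and every covariate name and weight below is a placeholder, not a value taken from the manuscript.

```python
def risk_score(covariates, weights):
    """Weighted sum of covariate values using weights fixed in advance.

    Assumption: a linear score. The weights are never re-estimated on the
    validation cohort, which is what makes the validation independent.
    """
    missing = set(weights) - set(covariates)
    if missing:
        raise KeyError(f"missing covariates: {sorted(missing)}")
    return sum(weights[name] * covariates[name] for name in weights)


# Placeholder weights and covariate names, for illustration only.
frozen_weights = {
    "baseline_monocyte_pct": 0.5,
    "cd8_tcm_relative_change": -1.0,
}
new_patient = {
    "baseline_monocyte_pct": 2.0,
    "cd8_tcm_relative_change": 1.0,
}
print(risk_score(new_patient, frozen_weights))
```

Because only the covariates named in the frozen weights are needed, a new cohort can be scored from a reduced measurement panel, as the response describes for the flow cytometry markers.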

  2. Evaluation Summary:

    Barber et al. present a manuscript discussing predictive factors for chemotherapy efficacy in head and neck squamous cancer (HNSCC). The paper is well written, and its style/formatting are optimal. The baseline signature moderately predicted outcome, and the data after one cycle further improved the algorithm, though this decreases its utility as a pure predictive tool. It is interesting that a subpopulation of monocytes, a subset of white peripheral cells long suspected to correlate with outcomes in HNSCC, was one of the key drivers of the algorithm. However, the overall impact in the field of this work seems limited by a number of factors, including that the authors focused on immune cell subpopulations and exosomes, which narrows the scope (no cytokines or other biomarkers were included); the signatures were not prospectively validated on an independent cohort; the algorithm was developed around a first-line therapy that is no longer considered to be the standard of care for HNSCC; and, while most of the conclusions are supported by the data, some of the caveats (such as the lack of a validation cohort, key in predictive biomarker development), are not addressed.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

  3. Reviewer #1 (Public Review):

    In this well-written manuscript by Barber and colleagues from UCL in the UK, the authors seek to identify a new predictive biomarker for patients with recurrent/metastatic head and neck cancer who are treated with chemotherapy. The manuscript is clearly written. This is an impressive body of correlative research performed in the context of samples collected from patients enrolled on a phase II trial, with samples collected and analyzed for immune monitoring. There are several novel assays employed beyond the standard immune monitoring. The question is of moderate clinical significance. There are a number of critical statistical limitations.

    The question is of moderate clinical significance to the field. It is correct that we have only modest predictive biomarkers for chemotherapy response in R/M HNSCC. For this study to be significant, one would want to see a marked improvement over current biomarkers, in a robust and generalizable population. Unfortunately, this study falls short in these respects. First, the authors do not adequately discuss the prior literature. Even a fairly crude and old-fashioned blood-based biomarker such as neutrophil:lymphocyte ratio has quite good predictive and prognostic capability in R/M HNSCC. It is not clear to me that there is a compelling need to do better -- given that existing predictive biomarkers based on clinical nomograms or NLR are actually used in practice.

    To establish that this fairly labor-intensive and expensive assay would add value, a comparison to other existing biomarkers is necessary. It is not clear qualitatively that the biomarker presented here is an improvement beyond what is currently available. This comparison could easily be performed.

    A large number (31 of 87) patients were not included due to lack of biomaterials. No analyses have been performed to examine the characteristics of these patients. It is unlikely that the collection of biomaterials has no correlation with disease characteristics, prognostic features, outcomes, or the analytes in this study. This exclusion -- akin to unequal censoring in clinical trials -- is likely to significantly impact results. Given that the population enrolled in a phase II trial, and that sub-population of patients who survive long enough and are feeling well enough to submit to large volume blood draws on trial, would not necessarily represent the real world population of R/M HNSCC patients, a broader population is needed to justify conclusions about this assay having robust predictive value.

    It is unclear why OS as a hard endpoint was not analyzed here. No explanation is provided, other than OS was not available, a statement that is difficult to understand, given that PFS was available, and overall survival is a component of PFS.

    There is no validation set for the biomarker. The biomarker was trained and cross-validated using Bayesian techniques to reduce overfitting. This is a valid approach for training and cross-validation, but for the biomarker to be testable and interpretable, it requires assessment in an independent dataset. There is no statistical technique that I am aware of that generates informative biomarkers without an independent validation dataset, and the use of these techniques to minimize overfitting does not circumvent this limitation, if one's goal is to develop a clinically useful biomarker. The 2 articles cited to justify this approach are not germane to the question -- one is an article describing the FRET-FLIM technique, and the other article describes the effectiveness of this approach to minimize overfitting.

    In the end, the degree of predictive value, as assessed by C-index and the spread in the PFS curves, is modest, and not clearly an improvement beyond currently available biomarkers. Given that this dataset is the training dataset -- with no validation dataset -- in a population that is unlikely to be representative of the R/M population, it is not clear that this expensive and labor-intensive immune monitoring approach has much to offer.
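For context on the C-index mentioned above, the sketch below shows one common definition, Harrell's concordance index, for right-censored data such as PFS. It is a plain illustration under the assumption that a higher risk score should correspond to earlier progression; it is not the authors' implementation, and the function name is hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by the risk score.

    A pair (i, j) is comparable when patient i has an observed event
    strictly before patient j's follow-up time. Tied risk scores count
    as half-concordant. A value of 0.5 corresponds to random ranking.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A C-index of 1.0 means every comparable pair is ordered correctly, so a "modest" value in the reviewer's sense lies somewhere between 0.5 and 1.0.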

  4. Reviewer #2 (Public Review):

    Barber et al. present a manuscript discussing predictive factors for chemotherapy efficacy in head and neck squamous cancer (HNSCC). The paper is well written, and its style/formatting are optimal. The baseline signature moderately predicted outcome, and the data after one cycle further improved the algorithm, though this decreases its utility as a pure predictive tool. It is interesting that a subpopulation of monocytes, a subset of white peripheral cells long suspected to correlate with outcomes in HNSCC, was one of the key drivers of the algorithm. However, the overall impact in the field of this work seems limited by a number of factors, including that the authors focused on immune cell subpopulations and exosomes, which narrows the scope (no cytokines or other biomarkers were included); the signatures were not prospectively validated on an independent cohort; the algorithm was developed around a first-line therapy that is no longer considered to be the standard of care for HNSCC; and, while most of the conclusions are supported by the data, some of the caveats (such as the lack of a validation cohort, key in predictive biomarker development), are not addressed.