Early stage NSCLC patients’ prognostic prediction with multi-information using transformer and graph neural network model

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    This work presents a new model that leverages imaging and non-imaging data for the prediction of the survival of patients with early-stage NSCLC. The new model sought to demonstrate the roles of imaging and non-imaging features in determining high-risk nodes within the graph neural network, and the results have the potential of broad interest to clinicians within the field of cancer and have a high value towards clinical application.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewer remained anonymous to the authors.)

Abstract

We propose a population graph model built from Transformer-generated imaging features and clinical features to predict overall survival (OS) and recurrence-free survival (RFS) in patients with early-stage non-small cell lung carcinoma (NSCLC), and we compare this model with traditional models.

Methods:

The study included 1705 patients with lung cancer (stages I and II) and used a public data set for external validation (n=127). We proposed a graph in which edges represent non-imaging patient characteristics and nodes represent imaging characteristics of the tumour region generated by a pretrained Vision Transformer. The model was compared with a TNM model and a ResNet-Graph model. To evaluate the models' performance, the area under the receiver operating characteristic curve (ROC-AUC) was calculated for both OS and RFS prediction. The Kaplan–Meier method was used to generate prognostic and survival estimates for low- and high-risk groups, along with net reclassification improvement (NRI), integrated discrimination improvement (IDI), and decision curve analysis. An additional subanalysis was conducted to examine the relationship between clinical data and imaging features associated with risk prediction.
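As an illustration of the Kaplan–Meier step mentioned above, the following minimal sketch computes the product-limit survival estimate for two risk groups. The function name `kaplan_meier` and the follow-up data are hypothetical and are not taken from the study; it is meant only to show how such curves are typically derived from follow-up times and event indicators.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) steps of the Kaplan-Meier estimate.

    times  : follow-up time for each patient
    events : 1 if the event (death or recurrence) was observed, 0 if censored
    """
    # Distinct times at which at least one event was observed, in ascending order.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under follow-up just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Events observed exactly at time t.
        observed = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - observed / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up data (months), split by a model's predicted risk group.
low_risk = kaplan_meier([12, 30, 45, 60, 60], [0, 1, 0, 0, 0])
high_risk = kaplan_meier([6, 10, 18, 24, 60], [1, 1, 0, 1, 0])
print("low risk :", low_risk)
print("high risk:", high_risk)
```

Censored patients contribute to the at-risk count until their last follow-up but never trigger a drop in the curve, which is why the two groups can be compared even when follow-up lengths differ.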

Results:

Our model achieved AUC values of 0.785 (95% confidence interval [CI]: 0.716–0.855) and 0.695 (95% CI: 0.603–0.787) on the testing and external data sets for OS prediction, and 0.726 (95% CI: 0.653–0.800) and 0.700 (95% CI: 0.615–0.785) for RFS prediction. Additional survival analyses indicated that our model outperformed the present TNM and ResNet-Graph models in terms of net benefit for survival prediction.
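To make the AUC figures easier to interpret: an AUC is the probability that a randomly chosen patient who experienced the event receives a higher risk score than a randomly chosen patient who did not. The sketch below computes it by direct pairwise comparison. The labels and scores are invented for illustration, and the function `roc_auc` is a hypothetical helper, not the study's evaluation code; how the confidence intervals were obtained is not stated here and is not reproduced.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties between an event score and a non-event score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event observed) and predicted risk scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of patients, so it suits small examples; rank-based implementations give the same value more efficiently on larger cohorts.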

Conclusions:

Our Transformer-Graph model, which was constructed using both imaging and non-imaging clinical features, was effective at predicting survival in patients with early-stage lung cancer. Some high-risk patients were distinguishable by using a similarity score function defined by non-imaging characteristics such as age, gender, histology type, and tumour location, while Transformer-generated features demonstrated additional benefits for patients whose non-imaging characteristics were non-discriminatory for survival outcomes.
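The idea of a population graph whose edges come from a similarity score over non-imaging characteristics can be sketched as follows. Everything specific here is an assumption made for illustration: the `Patient` fields, the scoring rule in `similarity`, the age window, the edge threshold, and the unweighted mean aggregation, which stands in for a learned graph neural network layer. The study's actual similarity function and network architecture are not given in this text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Patient:
    age: int
    gender: str
    histology: str
    location: str
    imaging: List[float]  # stand-in for Transformer-generated tumour features


def similarity(a: Patient, b: Patient, age_window: int = 5) -> int:
    """Hypothetical score: number of non-imaging attributes on which two patients agree."""
    return (
        int(abs(a.age - b.age) <= age_window)
        + int(a.gender == b.gender)
        + int(a.histology == b.histology)
        + int(a.location == b.location)
    )


def build_edges(patients: List[Patient], threshold: int = 3) -> Dict[int, List[int]]:
    """Connect two patients (nodes) when their similarity score reaches the threshold."""
    edges: Dict[int, List[int]] = {i: [] for i in range(len(patients))}
    for i in range(len(patients)):
        for j in range(i + 1, len(patients)):
            if similarity(patients[i], patients[j]) >= threshold:
                edges[i].append(j)
                edges[j].append(i)
    return edges


def aggregate(patients: List[Patient], edges: Dict[int, List[int]]) -> List[List[float]]:
    """One mean-aggregation step: each node averages its own and its neighbours' imaging features."""
    out = []
    for i, p in enumerate(patients):
        group = [p.imaging] + [patients[j].imaging for j in edges[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


# Three hypothetical patients; the first two share most non-imaging characteristics.
patients = [
    Patient(62, "M", "adenocarcinoma", "upper lobe", [1.0, 0.0]),
    Patient(65, "M", "adenocarcinoma", "upper lobe", [0.0, 1.0]),
    Patient(48, "F", "squamous", "lower lobe", [4.0, 4.0]),
]
edges = build_edges(patients)
smoothed = aggregate(patients, edges)
print(edges)
print(smoothed)
```

In this toy example the two clinically similar patients exchange imaging information, while the third, who has no neighbours, keeps their own imaging features unchanged. This mirrors the abstract's point that imaging features matter most where non-imaging characteristics do not discriminate between patients.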

Funding:

The study was supported by the National Natural Science Foundation of China (91959126, 8210071009), and Science and Technology Commission of Shanghai Municipality (20XD1403000, 21YF1438200).

Article activity feed

  1. Reviewer #1 (Public Review):

    The manuscript by Lian et al. presents a population graph deep learning model, constructed using Transformer-generated imaging features and non-imaging clinical characteristics, that proved effective at predicting the survival of patients with early-stage NSCLC. The study demonstrates that the GNN-based model significantly outperforms the TNM model and the ResNet-Graph model in predicting survival in all datasets. The paper is well-written, clear for a general audience, brings innovations from computer vision into the medical field, and presents a usable tool for survival analysis. The strengths and limitations of the approach are brought forth in the discussion.

  2. Reviewer #2 (Public Review):

    Lian et al. present a new model that leverages imaging and non-imaging data for the prediction of the survival of patients with early-stage NSCLC. In particular, the group demonstrated the feasibility of using a Vision Transformer on CT images of the lung tumour to generate features for cancer survival analysis. The authors also used a graph structure to embed patients' imaging and non-imaging clinical data separately in the graph neural network and attempted to explain how clinical data communicates with Transformer-generated imaging features for survival analysis. The study included 1705 patients with lung cancer (stages I and II), and a public dataset for external validation (n=127). Additional survival analyses indicated that the new model outperformed the present TNM and ResNet-Graph models in terms of net benefit for survival prediction. Most current prognostic prediction methods have focused on either imaging data or non-imaging clinical data such as sex, age, and disease history, but it is difficult to combine that information. The new model demonstrated the ability to combine non-imaging clinical features with imaging features in an understandable manner and sought to understand the roles of imaging and non-imaging features in determining high-risk nodes within the graph neural network, thus presenting both novelty and high quality. However, the patients' demographics differed in terms of age, cancer staging, gender, and ethnicity, and these differences, together with treatment and follow-up strategies, have an impact on the prediction performance. As more clinical data become available, the model may produce more informative signatures.