Decoding Liver Cancer Prognosis: From Multi-omics Subtypes, Prognostic Models to Single Cell Validation

Curation statements for this article:
  • Curated by eLife

    eLife Assessment

    This important revised manuscript presents compelling findings by delineating two molecularly distinct liver cancer subtypes through comprehensive multi-omics integration and constructing a rigorously validated prognostic model. The authors have strengthened the analytical framework and validation across multiple datasets, including single-cell RNA sequencing. The evidence remains robust, with enhanced methodological clarity and expanded validation in both internal and independent cohorts. The revisions have improved the study's rigor and translational relevance.

Abstract

Hepatocellular carcinoma (HCC) is a highly aggressive tumor characterized by significant heterogeneity and invasiveness, leading to a lack of precise individualized treatment strategies and poor patient outcomes. This necessitates the urgent development of accurate patient stratification methods and targeted therapies based on distinct tumor characteristics. By integrating gene expression data from The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium (ICGC), and Gene Expression Omnibus (GEO), we identified subtypes through a multi-omics consensus clustering approach amalgamated from 10 clustering techniques. Subsequently, we developed a prognostic model, employing machine learning algorithms, based on subtype classification features. Finally, by analyzing single cell sequencing data, we investigated the mechanisms driving prognostic variations among distinct subtypes. First, we developed a novel consensus clustering method that categorizes liver cancer patients into two subtypes, CS1 and CS2. Second, we constructed a prognostic prediction model, which demonstrated superior predictive accuracy compared to several models published in the past five years. Finally, we observed differences between CS1 and CS2 in various metabolic pathways, biological processes, and signaling pathways, such as fatty acid metabolism, hypoxia levels, PI3K-AKT and MIF signaling pathway.

Article activity feed

  1. eLife Assessment

    This important revised manuscript presents compelling findings by delineating two molecularly distinct liver cancer subtypes through comprehensive multi-omics integration and constructing a rigorously validated prognostic model. The authors have strengthened the analytical framework and validation across multiple datasets, including single-cell RNA sequencing. The evidence remains robust, with enhanced methodological clarity and expanded validation in both internal and independent cohorts. The revisions have improved the study's rigor and translational relevance.

  2. Reviewer #1 (Public review):

    Summary:

    The authors aimed to classify hepatocellular carcinoma (HCC) patients into distinct subtypes using a comprehensive multi-omics approach. They employed an innovative consensus clustering method that integrates multiple omics data types, including mRNA, lncRNA, miRNA, DNA methylation, and somatic mutations. The study further sought to validate these subtypes by developing prognostic models using machine learning algorithms and extending the findings through single-cell RNA sequencing (scRNA-seq) to explore the cellular mechanisms driving subtype-specific prognostic differences.

    Strengths:

    (1) Comprehensive Data Integration: The study's integration of various omics data provides a well-rounded view of the molecular characteristics underlying HCC. This multi-omics approach is a significant strength, as it allows for a more accurate and detailed classification of cancer subtypes.

    (2) Innovative Methodology: The use of a consensus clustering approach that combines results from 10 different clustering algorithms is a notable methodological advancement. This approach reduces the bias that can result from relying on a single clustering method, enhancing the robustness of the findings.

    (3) Machine Learning-Based Prognostic Modeling: The authors rigorously apply a wide array of machine learning algorithms to develop and validate prognostic models, testing 101 different algorithm combinations. This comprehensive approach underscores the study's commitment to identifying the most predictive models, which is a considerable strength.

    (4) Validation Across Multiple Cohorts: The external validation of findings in independent cohorts is a critical strength, as it increases the generalizability and reliability of the results. This step is essential for demonstrating the clinical relevance of the proposed subtypes and prognostic models.
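
    To make the consensus idea in strength (2) concrete, the following is a minimal sketch of one common way to merge several clusterings: count how often each pair of samples is placed in the same cluster, then group samples that co-cluster in most runs. It is not the authors' implementation, which this review does not describe in detail. The function names, the 0.5 threshold, and the toy labels are assumptions made for illustration.

```python
def coassociation(labelings):
    """Return the fraction of labelings in which each sample pair co-clusters."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Connected components of the graph linking pairs that co-cluster
    in more than `threshold` of the labelings (threshold is an assumption)."""
    m = coassociation(labelings)
    n = len(m)
    assigned = [None] * n
    next_id = 0
    for start in range(n):
        if assigned[start] is not None:
            continue
        assigned[start] = next_id
        stack = [start]
        while stack:  # flood-fill over strong co-association edges
            i = stack.pop()
            for j in range(n):
                if assigned[j] is None and m[i][j] > threshold:
                    assigned[j] = next_id
                    stack.append(j)
        next_id += 1
    return assigned


# Hypothetical labels from three algorithms for six samples.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition, different label names
    [0, 0, 1, 1, 1, 1],  # disagrees on the third sample
]
consensus = consensus_clusters(labelings)
print(consensus)
```

    Because only pairwise agreement is counted, the result does not depend on how each algorithm names its clusters, and a single dissenting algorithm cannot move a sample on its own.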

    Weaknesses:

    (1) Inconsistent Storyline:
    Despite the extensive data mining and rigorous methodologies, the manuscript suffers from a lack of a coherent and consistent narrative. The transition between different sections, particularly from multi-omics data integration to single-cell validation, feels disjointed. A clearer articulation of how each analysis ties into the overall research question would improve the manuscript.

    (2) Questionable Relevance of Immune Cell Activity Analysis:
    The evaluation of immune cell activities within the cancer cell model raises concerns about its meaningfulness. The methods used to assess immune function in the tumor microenvironment may not be fully appropriate, potentially limiting the insights gained from this part of the study.

    (3) Incomplete Single-Cell RNA-Seq Validation:
    The validation of the findings using single-cell RNA-seq data appears insufficient to fully support the study's claims. While the authors make an effort to extend their findings to the single-cell level, the analysis lacks depth. A more comprehensive validation is necessary to substantiate the robustness of the identified subtypes.

    (4) Figures and Visualizations:
    Several figures in the manuscript are missing necessary information, which affects the clarity of the results. For instance, the pathways in Figure 3A could be clustered to enhance interpretability, the blue bar in Figure 4A is unexplained, and Figure 4B is not discussed in the text. Additionally, the figure legend in Figure 7C lacks detail, and many figure descriptions merely repeat the captions without providing deeper insights.

    (5) Appraisal of the Study's Aims and Results:
    The authors have set out to achieve an ambitious goal of classifying HCC patients into distinct prognostic subtypes and validating these findings through both bulk and single-cell analyses. While the methodologies employed are innovative and the data integration comprehensive, the study falls short of fully achieving its aims due to inconsistencies in the narrative and incomplete validation. The results partially support the conclusions, but the lack of coherence and depth in certain areas limits the overall impact of the study.

    (6) Impact on the Field:
    If the identified weaknesses are addressed, this study has the potential to significantly impact the field of HCC research. The multi-omics approach combined with machine learning is a powerful framework that could set a new standard for cancer subtype classification. However, the current state of the manuscript leaves some uncertainty regarding the practical applicability of the findings, particularly in clinical settings.

    (7) Additional Context:
    For readers and researchers, this study offers a valuable look into the potential of integrating multi-omics data with machine learning to improve cancer classification and prognostication. However, readers should be aware of the noted weaknesses, particularly the need for more consistent narrative development and comprehensive validation of the methods. Addressing these issues could greatly enhance the study's utility and relevance to the community.

    Comments on revisions:

    The authors have addressed the reviewers' concerns effectively.

  3. Author response:

    The following is the authors’ response to the original reviews

    Reviewer #1 (Recommendations for the authors):

    (1) Storyline and Narrative Flow:

    Consider revising the manuscript to create a more coherent and consistent narrative. Clarify how each section of the study, particularly the transition from multi-omics data integration to single-cell RNA-seq validation, contributes to the overall research question. This will help readers better understand the logical flow of the study.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    We have modified the text, including the connections between sections of the Results and the stated objectives and roles of the analyses in each section. This improves the coherence of the manuscript and clarifies the objective and function of each analysis. We believe this will help readers better understand the main content of the text.

    (2) Immune Cell Activity Analysis:

    Reevaluate the methods used to assess immune cell activities within the context of the tumor microenvironment. Consider providing additional justification for the relevance of using the cancer cell model for this analysis. If necessary, explore alternative methods or models that might offer more meaningful insights into immune-tumor interactions.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    Using bulk RNA-seq data, we evaluated the tumor immune microenvironment with several methods to assess immune infiltration levels and responses to immunotherapy. The results were largely consistent with those presented in the manuscript, providing strong support for our viewpoints. We also acknowledge the limitations of findings from bioinformatics analysis. In our upcoming research, we plan to develop organoid models with the gene expression patterns of both the CS1 and CS2 subtypes and to use these models as a foundation for studying the tumor immune microenvironment.

    (3) Single-Cell RNA-Seq Validation:

    Expand the validation of your findings using single-cell RNA-seq data. This could include more in-depth analyses that explore the heterogeneity within the subtypes and confirm the robustness of your classification method at the single-cell level. This would strengthen the support for your claims about the relevance of the identified subtypes.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    In this manuscript, we used the NTP algorithm to classify the malignant cells identified by the CopyKAT algorithm, based on the characteristic genes of the CS1 and CS2 subtypes. This approach is similar to the previous method that classified patients in the ICGC cohort with the same subtype genes. We therefore consider this classification method valid.

    After classifying the malignant cells, we performed metabolic and cell communication analyses on the CS1 and CS2 subtype cells. These revealed significant differences in the biological pathways enriched by differential genes, in metabolic levels, and in cell signaling patterns. The differences align with the variations observed in the earlier classifications and analyses based on bulk RNA-seq data.

    We also acknowledge that validating the classification method solely with the single-cell dataset from this study is insufficient. We analyzed GSE202642 using the same processes and methods as GSE229772, finding that the results were generally consistent, indicating that our classification method exhibits a degree of robustness at the single-cell level.
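
    As an illustration of the template-based classification step described above, here is a simplified sketch, assuming NTP refers to nearest template prediction. Each subtype gets a binary template over the marker genes, and a cell is assigned to the subtype whose template is closest in cosine distance. The permutation-based significance testing that NTP normally includes is omitted, and the gene names, expression values, and function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def nearest_template(expression, markers):
    """Assign one sample to the subtype with the nearest marker template.

    expression: gene -> standardized expression for a single cell or sample
    markers:    subtype -> set of marker genes (hypothetical signatures)
    """
    genes = sorted({g for gs in markers.values() for g in gs if g in expression})
    profile = [expression[g] for g in genes]
    distances = {}
    for subtype, gs in markers.items():
        # Template is 1 for this subtype's markers and 0 for the others.
        template = [1.0 if g in gs else 0.0 for g in genes]
        distances[subtype] = cosine_distance(profile, template)
    return min(distances, key=distances.get), distances


# Toy example with made-up gene names and values.
markers = {"CS1": {"G1", "G2"}, "CS2": {"G3", "G4"}}
cell = {"G1": 1.5, "G2": 0.8, "G3": -1.0, "G4": -0.4}
label, distances = nearest_template(cell, markers)
print(label, distances)
```

    Because the same templates can be applied to bulk samples and to single cells, this kind of classifier is a natural way to carry subtype definitions from one data type to the other.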

    (4) Methodological Justification:

    Provide a more detailed rationale for the selection of machine learning algorithms and integration strategies used in the study. Explain why the chosen methods are particularly well-suited for this research, and discuss any potential limitations they might have.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    We have updated the methodology section to enhance readers' understanding of the fundamental principles involved. This analysis has two key features. First, it combines 10 machine learning algorithms into 101 algorithm combinations and selects the prognostic prediction model with the highest C-index among them. Second, it uses leave-one-out cross-validation (LOOCV) to analyze the training and validation sets. Compared with the conventional approach of randomly dividing the data into training and validation sets at a fixed ratio, this substantially reduces the bias and randomness introduced by the splitting process. We therefore believe this analysis can leverage the characteristic genes of the CS1 and CS2 subtypes, combined with existing clinical data from public databases, to yield results that are more accurate and reliable than the prognostic models commonly used in the previous literature, such as Cox regression, Lasso regression, and other individual algorithms. Although this analysis has advantages over some previous modeling methods, it still relies on public databases, and the mathematical logic of the algorithms may obscure factors that are clinically relevant to patient prognosis.
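
    The selection criterion described above can be sketched as follows. This is a minimal pure-Python version of Harrell's C-index together with a helper that picks the candidate with the highest mean C-index across cohorts. Averaging across cohorts is an assumption based on the description of Figure 4A, and the model names, scores, and toy survival data are hypothetical rather than taken from the study.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had the event strictly
    before patient j's follow-up time. Ties in risk count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


def best_model(cindex_by_model):
    """Return the model name with the highest mean C-index across cohorts."""
    return max(
        cindex_by_model,
        key=lambda name: sum(cindex_by_model[name]) / len(cindex_by_model[name]),
    )


# Toy survival data: follow-up time, event indicator (1 = death), risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.3, 0.2, 0.4]
c_index = concordance_index(times, events, risks)

# Hypothetical C-indexes per model in two cohorts (e.g. training, validation).
candidates = {"model_a": [0.70, 0.66], "model_b": [0.72, 0.60]}
selected = best_model(candidates)
print(c_index, selected)
```

    A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why it is a convenient single number for comparing many candidate models.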

    (5) Figures and Visualizations:

    Improve the clarity of your figures by addressing the following:

    a) Figure 3A: Cluster the pathways to make the comparisons clearer and more meaningful.

    b) Figure 4A: Clearly explain the significance of the blue bar.

    c) Figure 4B: Ensure this figure is discussed in the main text to justify its inclusion.

    d) Figure 7C: Enhance the figure legend to provide more informative details.

    Additionally, ensure that figure descriptions go beyond the captions and provide detailed explanations that help the reader understand the significance of each figure.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    Figure 3A: We clustered the samples based on CS1 and CS2 subtypes and displayed the immune-related cell scores of each sample as a heatmap.

    Figure 4A: The blue bars in the figure represent the average C-index of this algorithm combination in the training dataset TCGA and the validation dataset ICGC, which we have supplemented in the corresponding sections of the text.

    Figure 4B: We described this figure in the results section, which primarily aims to validate whether our prognostic prediction model can predict patient outcomes in the TCGA cohort. The results showed that after performing prognostic risk scoring on patients based on the prediction model and categorizing them into high-risk and low-risk groups, the two groups exhibited significant prognostic differences, with the high-risk group showing worse outcomes compared to the low-risk group. This indicates that our prognostic prediction model can effectively distinguish the prognostic risk differences among patients in the TCGA-LIHC cohort. We also discussed these findings in the discussion section.
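
    The high-risk versus low-risk comparison described for Figure 4B can be illustrated with a small sketch. It assumes a median cutoff on the risk score and a two-group log-rank test; neither choice is stated in this response, so both are assumptions, and the patient identifiers and survival data are invented for the example.

```python
import math
from statistics import median


def split_by_median(risk_scores):
    """Label each patient 'high' or 'low' relative to the median risk score."""
    cutoff = median(risk_scores.values())
    return {pid: "high" if s > cutoff else "low" for pid, s in risk_scores.items()}


def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    observed = expected = variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        died = [g for ti, e, g in zip(times, events, groups) if ti == t and e]
        n, d = len(at_risk), len(died)
        n_high = at_risk.count("high")
        observed += died.count("high")
        expected += d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Toy cohort: hypothetical risk scores, follow-up times, and event flags.
risk_scores = {"p1": 0.9, "p2": 0.8, "p3": 0.7, "p4": 0.3, "p5": 0.2, "p6": 0.1}
times = [2, 3, 4, 10, 12, 14]
events = [1, 1, 1, 1, 0, 1]

labels = split_by_median(risk_scores)
groups = [labels[pid] for pid in risk_scores]
chi2, p_value = logrank_test(times, events, groups)
print(labels, chi2, p_value)
```

    In practice a survival analysis library would usually be used for the test; the hand-written version here is only meant to show what the comparison between the two risk groups computes.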

    Figure 7C: We used both point color and size to visualize the levels of metabolic scores, resulting in two dimensions in the legend, which actually represent the same information. Therefore, we removed the results that used point size to indicate the levels of metabolic scores.

    (6) Supplementary Materials:

    Consider including more detailed supplementary materials that provide additional validation data, extended methodological descriptions, and any other information that would support the robustness of your findings.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    In the subsequent version of the record, we will upload the important results obtained during the research to GitHub, and in this revision, we have updated some figures that may better explain the results or the robustness of the findings as supplementary materials.

    (7) Recent Literature:

    a) Incorporate more recent studies in your discussion, especially those related to HCC subtypes and the application of machine learning in oncology. This will provide a more current context for your work and help position your findings within the broader field.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    We have reviewed several studies related to HCC subtype classification and the application of machine learning in this field. In the discussion section, we summarize the significance and limitations of these studies. Additionally, we discuss the characteristics of our study in comparison to previous research in this field.

    (8) Data and Code Availability:

    Ensure that all data, code, and materials used in your study are made available in line with eLife's policies. Provide clear links to repositories where readers can access the data and code used in your analyses.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    We have examined the relevant data, code, and materials. We confirm that we have indicated the sources of the data and tools used in the analysis within the manuscript. Moreover, these data and tools are accessible via the websites or references we have provided.

    Reviewer #2 (Recommendations for the authors):

    (1) While the computational findings are robust, further experimental validation of the two subtypes, particularly the role of the MIF signaling pathway, would strengthen the biological relevance of the findings. In vitro or in vivo validation could confirm the proposed mechanisms and their influence on patient prognosis.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    We intend to verify our findings in future studies using tumor cell line models and animal models. We aim to identify and intervene with key molecules in the MIF signaling pathway. We will investigate how the MIF signaling pathway affects tumor sensitivity to treatment in both cell line and animal models, along with the underlying mechanisms.

    (2) Consider testing the model on additional independent cohorts beyond the TCGA and ICGC datasets to further demonstrate its generalizability and applicability across different patient populations.

    We thank the reviewer for this suggestion, which highlighted deficiencies in this area. We have made the following modifications:

    We analyzed the GSE14520 dataset from the GEO database, which contains a cohort of 209 HCC patients with corresponding RNA sequencing data. We validated the prognostic model from this study in this cohort and found that it effectively separates patients into high-risk and low-risk prognostic groups, with a significant prognostic difference between the two groups. This is consistent with the results we obtained previously.

    (3) Review the manuscript for long or complex sentences, which can be broken down into shorter, more readable parts.

    We have made revisions to the long and complex sentences in the manuscript without compromising its academic integrity and rationality, with the hope that this will help readers better understand the content of this study.

    During the revision process, in addition to addressing the reviewer comments, we conducted a thorough review of the analysis. In the course of this review, we identified a few errors in the data usage and have since corrected the relevant data and figures:

    Figure 4: Due to space constraints, we adjusted the composition of the figures after incorporating the validation results from the GSE14520 dataset.

    Figure 5A: We rechecked the regression coefficients included in the model, updated several more recent prognostic models, and calculated the C-index for 20 prognostic models in the TCGA and ICGC cohorts using a method consistent with previous studies.

    Figure 5C-D: We adjusted the clarity of the figures.

    Figure 8: We reclassified the selected malignant cells and updated the subtype results. Based on the repeatedly confirmed typing results, we then updated all downstream results of the cell communication network analysis, ensuring that the entire analysis process remains consistent with previous findings. We also adjusted the composition of the figure and present the images that could not be merged because of space constraints as Figure 9.

  4. eLife Assessment

    This manuscript offers valuable insights by identifying two distinct liver cancer subtypes through multi-omics integration and developing a robust prognostic model, validated across various datasets, including single-cell RNA sequencing. The evidence is solid, with comprehensive validation in both internal and independent cohorts; however, the reliance on computational methods highlights the necessity for further experimental validation to fully confirm the mechanistic insights.

  5. Reviewer #1 (Public review):

    Summary:

    The authors aimed to classify hepatocellular carcinoma (HCC) patients into distinct subtypes using a comprehensive multi-omics approach. They employed an innovative consensus clustering method that integrates multiple omics data types, including mRNA, lncRNA, miRNA, DNA methylation, and somatic mutations. The study further sought to validate these subtypes by developing prognostic models using machine learning algorithms and extending the findings through single-cell RNA sequencing (scRNA-seq) to explore the cellular mechanisms driving subtype-specific prognostic differences.

    Strengths:

    (1) Comprehensive Data Integration: The study's integration of various omics data provides a well-rounded view of the molecular characteristics underlying HCC. This multi-omics approach is a significant strength, as it allows for more accurate and detailed classification of cancer subtypes.

    (2) Innovative Methodology: The use of a consensus clustering approach that combines results from 10 different clustering algorithms is a notable methodological advancement. This approach reduces the bias that can result from relying on a single clustering method, enhancing the robustness of the findings.

    (3) Machine Learning-Based Prognostic Modeling: The authors rigorously apply a wide array of machine learning algorithms to develop and validate prognostic models, testing 101 different algorithm combinations. This comprehensive approach underscores the study's commitment to identifying the most predictive models, which is a considerable strength.

    (4) Validation Across Multiple Cohorts: The external validation of findings in independent cohorts is a critical strength, as it increases the generalizability and reliability of the results. This step is essential for demonstrating the clinical relevance of the proposed subtypes and prognostic models.

    Weaknesses:

    (1) Inconsistent Storyline:
    Despite the extensive data mining and rigorous methodologies, the manuscript suffers from a lack of a coherent and consistent narrative. The transition between different sections, particularly from multi-omics data integration to single-cell validation, feels disjointed. A clearer articulation of how each analysis ties into the overall research question would improve the manuscript.

    (2) Questionable Relevance of Immune Cell Activity Analysis:
    The evaluation of immune cell activities within the cancer cell model raises concerns about its meaningfulness. The methods used to assess immune function in the tumor microenvironment may not be fully appropriate, potentially limiting the insights gained from this part of the study.

    (3) Incomplete Single-Cell RNA-Seq Validation:
    The validation of the findings using single-cell RNA-seq data appears insufficient to fully support the study's claims. While the authors make an effort to extend their findings to the single-cell level, the analysis lacks depth. A more comprehensive validation is necessary to substantiate the robustness of the identified subtypes.

    (4) Figures and Visualizations:
    Several figures in the manuscript are missing necessary information, which affects the clarity of the results. For instance, the pathways in Figure 3A could be clustered to enhance interpretability, the blue bar in Figure 4A is unexplained, and Figure 4B is not discussed in the text. Additionally, the figure legend in Figure 7C lacks detail, and many figure descriptions merely repeat the captions without providing deeper insights.

    (5) Appraisal of the Study's Aims and Results:
    The authors have set out to achieve an ambitious goal of classifying HCC patients into distinct prognostic subtypes and validating these findings through both bulk and single-cell analyses. While the methodologies employed are innovative and the data integration comprehensive, the study falls short of fully achieving its aims due to inconsistencies in the narrative and incomplete validation. The results partially support the conclusions, but the lack of coherence and depth in certain areas limits the overall impact of the study.

    (6) Impact on the Field:
    If the identified weaknesses are addressed, this study has the potential to significantly impact the field of HCC research. The multi-omics approach combined with machine learning is a powerful framework that could set a new standard for cancer subtype classification. However, the current state of the manuscript leaves some uncertainty regarding the practical applicability of the findings, particularly in clinical settings.

    (7) Additional Context:
    For readers and researchers, this study offers a valuable look into the potential of integrating multi-omics data with machine learning to improve cancer classification and prognostication. However, readers should be aware of the noted weaknesses, particularly the need for more consistent narrative development and comprehensive validation of the methods. Addressing these issues could greatly enhance the study's utility and relevance to the community.

  6. Reviewer #2 (Public review):

    Summary:

    Overall, this is a well-executed and insightful study. With some refinement to the presentation and a deeper exploration of the implications, the manuscript will make a significant contribution to the field of cancer genomics and personalized medicine.

    Strengths:

    The manuscript integrates multi-omics data with machine learning to address the significant heterogeneity of hepatocellular carcinoma (HCC). The use of multiple clustering algorithms and a consensus method strengthens the robustness of the findings. The study successfully develops a prognostic model with excellent predictive accuracy, validated across independent datasets. This adds considerable value to the field, particularly in providing individualized treatment strategies. The identification of two distinct liver cancer subtypes with different biological and metabolic characteristics is well-supported by the data, offering a promising direction for personalized medicine.

    Weaknesses:

    (1) Consider streamlining the presentation of methods, especially regarding the clustering algorithms and machine learning models. Readers may find it difficult to follow the exact process unless more clearly outlined.

    (2) Some figures, such as the signaling pathways and heatmaps, are critical to understanding the study's findings. Ensure that all figures are high quality, easy to interpret, and adequately labeled. You may also want to highlight the key findings within the figure captions more explicitly.

    (3) While the manuscript does compare its prognostic model to those previously published, the novelty of the findings could be emphasized more clearly. Discussing the potential limitations of the study (e.g., the reliance on computational models and small sample sizes for scRNA-seq) could strengthen the manuscript.

    (4) The manuscript mentions that the data was split into training and validation datasets in a 1:1 ratio. How was the performance verified? Is there an independent test set?

    (5) The role of the MIF signaling pathway in subtype differentiation is intriguing, but further mechanistic insights into how this pathway drives the differences between CS1 and CS2 could be discussed in more detail. If experimental evidence for this pathway exists in the literature, it should be mentioned.

    (6) Some sentences are quite long and complex, which can affect readability. Breaking them down into shorter, clearer sentences would improve the flow.

  7. Author response:

    Reviewer #1 (Recommendations for the authors):

    (1) Storyline and Narrative Flow:

    Consider revising the manuscript to create a more coherent and consistent narrative. Clarify how each section of the study, particularly the transition from multi-omics data integration to single-cell RNA-seq validation, contributes to the overall research question. This will help readers better understand the logical flow of the study.

    In the upcoming revisions, we will optimize the logical connections between sections of the manuscript to clarify the role each part plays in the overall research question, making it easier for readers to follow.

    (2) Immune Cell Activity Analysis:

    Reevaluate the methods used to assess immune cell activities within the context of the tumor microenvironment. Consider providing additional justification for the relevance of using the cancer cell model for this analysis. If necessary, explore alternative methods or models that might offer more meaningful insights into immune-tumor interactions.

    We fully recognize the importance of using tumor models to analyze and validate immune activity results, and we are considering experimental research in this area in future projects.

    (3) Single-Cell RNA-Seq Validation:

    Expand the validation of your findings using single-cell RNA-seq data. This could include more in-depth analyses that explore the heterogeneity within the subtypes and confirm the robustness of your classification method at the single-cell level. This would strengthen the support for your claims about the relevance of the identified subtypes.

    In the current study, we have applied the obtained multi-omics profiling features to single-cell sequencing data to classify malignant cells. We analyzed the metabolic and cell communication differences between different subtypes of malignant cells and explored potential reasons for these differences. Next, we plan to conduct further analysis of the differences between malignant cell subtypes to identify additional clues and mechanisms underlying these variations.

    (4) Methodological Justification:

    Provide a more detailed rationale for the selection of machine learning algorithms and integration strategies used in the study. Explain why the chosen methods are particularly well-suited for this research, and discuss any potential limitations they might have.

    In the revised manuscript, we will include descriptions of the principles of these analytical methods, as well as examples of their application in other studies, to discuss the rationale and limitations of applying these methods in this research.

    (5) Figures and Visualizations:

    Improve the clarity of your figures by addressing the following:

    a) Figure 3A: Cluster the pathways to make the comparisons clearer and more meaningful.

    b) Figure 4A: Clearly explain the significance of the blue bar.

    c) Figure 4B: Ensure this figure is discussed in the main text to justify its inclusion.

    d) Figure 7C: Enhance the figure legend to provide more informative details.

    Additionally, ensure that figure descriptions go beyond the captions and provide detailed explanations that help the reader understand the significance of each figure.

    We fully agree with the reviewer’s suggestions regarding these figures, and we will make the necessary revisions in the revised manuscript.

    (6) Supplementary Materials:

    Consider including more detailed supplementary materials that provide additional validation data, extended methodological descriptions, and any other information that would support the robustness of your findings.

    When we submit the revised manuscript, we will include supplementary materials, such as figures or tables, that make the presentation of the manuscript more complete.

    (7) Recent Literature:

    a) Incorporate more recent studies in your discussion, especially those related to HCC subtypes and the application of machine learning in oncology. This will provide a more current context for your work and help position your findings within the broader field.

    We appreciate the reviewer's suggestion. We will incorporate more recent studies into the discussion section and optimize its content.

    (8) Data and Code Availability:

    Ensure that all data, code, and materials used in your study are made available in line with eLife's policies. Provide clear links to repositories where readers can access the data and code used in your analyses.

    We have indicated the sources of the data and tools used in the analysis process within the text, and these data and tools can be accessed through the websites or literature we have cited.

    Reviewer #2 (Recommendations for the authors):

    (1) While the computational findings are robust, further experimental validation of the two subtypes, particularly the role of the MIF signaling pathway, would strengthen the biological relevance of the findings. In vitro or in vivo validation could confirm the proposed mechanisms and their influence on patient prognosis.

    We fully recognize the importance of using tumor models to analyze and validate immune activity results, and we are considering experimental research in this area in future projects.

    (2) Consider testing the model on additional independent cohorts beyond the TCGA and ICGC datasets to further demonstrate its generalizability and applicability across different patient populations.

    We are considering looking for independent external datasets in the GEO database or other databases to validate our model.

    (3) Review the manuscript for long or complex sentences, which can be broken down into shorter, more readable parts.

    In the revised manuscript, we will address any grammatical issues present in the manuscript and modify long and complex sentences that may hinder reader comprehension.