Decoding Liver Cancer Prognosis: From Multi-omics Subtypes, Prognostic Models to Single Cell Validation

Curation statements for this article:
  • Curated by eLife

Abstract

Purpose

Hepatocellular carcinoma (HCC) is a highly aggressive tumor characterized by significant heterogeneity and invasiveness, leading to a lack of precise individualized treatment strategies and poor patient outcomes. This necessitates the urgent development of accurate patient stratification methods and targeted therapies based on distinct tumor characteristics.

Experimental Design

By integrating gene expression data from The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium (ICGC), and Gene Expression Omnibus (GEO), we identified subtypes using a multi-omics consensus clustering approach that combines 10 clustering techniques. We then used machine learning algorithms to develop a prognostic model from the subtype classification features. Finally, by analyzing single-cell sequencing data, we investigated the mechanisms driving prognostic variations among distinct subtypes.

Results

First, we developed a novel consensus clustering method that categorizes liver cancer patients into two subtypes, CS1 and CS2. Second, we constructed a prognostic prediction model, which demonstrated superior predictive accuracy compared to several models published in the past five years. Finally, we observed differences between CS1 and CS2 in various metabolic pathways, biological processes, and signaling pathways, such as fatty acid metabolism, hypoxia levels, and the PI3K-AKT signaling pathway.

Article activity feed

  1. eLife Assessment

    This manuscript offers valuable insights by identifying two distinct liver cancer subtypes through multi-omics integration and developing a robust prognostic model, validated across various datasets, including single-cell RNA sequencing. The evidence is solid, with comprehensive validation in both internal and independent cohorts; however, the reliance on computational methods highlights the necessity for further experimental validation to fully confirm the mechanistic insights.

  2. Reviewer #1 (Public review):

    Summary:

    The authors aimed to classify hepatocellular carcinoma (HCC) patients into distinct subtypes using a comprehensive multi-omics approach. They employed an innovative consensus clustering method that integrates multiple omics data types, including mRNA, lncRNA, miRNA, DNA methylation, and somatic mutations. The study further sought to validate these subtypes by developing prognostic models using machine learning algorithms and extending the findings through single-cell RNA sequencing (scRNA-seq) to explore the cellular mechanisms driving subtype-specific prognostic differences.

    Strengths:

    (1) Comprehensive Data Integration: The study's integration of various omics data provides a well-rounded view of the molecular characteristics underlying HCC. This multi-omics approach is a significant strength, as it allows for more accurate and detailed classification of cancer subtypes.

    (2) Innovative Methodology: The use of a consensus clustering approach that combines results from 10 different clustering algorithms is a notable methodological advancement. This approach reduces the bias that can result from relying on a single clustering method, enhancing the robustness of the findings.

    (3) Machine Learning-Based Prognostic Modeling: The authors rigorously apply a wide array of machine learning algorithms to develop and validate prognostic models, testing 101 different algorithm combinations. This comprehensive approach underscores the study's commitment to identifying the most predictive models, which is a considerable strength.

    (4) Validation Across Multiple Cohorts: The external validation of findings in independent cohorts is a critical strength, as it increases the generalizability and reliability of the results. This step is essential for demonstrating the clinical relevance of the proposed subtypes and prognostic models.
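
The consensus idea praised in strength (2) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each clustering algorithm has already produced one label per sample, builds a co-association matrix (the fraction of algorithms that place two samples in the same cluster), and groups samples whose co-association exceeds a hypothetical threshold of 0.5. The function names, the threshold, and the toy labelings are all illustrative assumptions.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of labelings that place each pair of samples in the same cluster."""
    n = len(labelings[0])
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        agree = sum(1 for labels in labelings if labels[i] == labels[j])
        matrix[i][j] = matrix[j][i] = agree / len(labelings)
    return matrix


def consensus_clusters(matrix, threshold=0.5):
    """Group samples linked by co-association above `threshold` (union-find)."""
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three hypothetical algorithms labelling six samples. Label names are
# arbitrary per algorithm, so only co-membership matters.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
matrix = co_association(labelings)
consensus = consensus_clusters(matrix)
```

In this toy input the third algorithm disagrees about one sample, yet the majority still places it with the first group, which is the sense in which a consensus reduces dependence on any single method.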

    Weaknesses:

    (1) Inconsistent Storyline:
    Despite the extensive data mining and rigorous methodologies, the manuscript suffers from a lack of a coherent and consistent narrative. The transition between different sections, particularly from multi-omics data integration to single-cell validation, feels disjointed. A clearer articulation of how each analysis ties into the overall research question would improve the manuscript.

    (2) Questionable Relevance of Immune Cell Activity Analysis:
    The evaluation of immune cell activities within the cancer cell model raises concerns about its meaningfulness. The methods used to assess immune function in the tumor microenvironment may not be fully appropriate, potentially limiting the insights gained from this part of the study.

    (3) Incomplete Single-Cell RNA-Seq Validation:
    The validation of the findings using single-cell RNA-seq data appears insufficient to fully support the study's claims. While the authors make an effort to extend their findings to the single-cell level, the analysis lacks depth. A more comprehensive validation is necessary to substantiate the robustness of the identified subtypes.

    (4) Figures and Visualizations:
    Several figures in the manuscript are missing necessary information, which affects the clarity of the results. For instance, the pathways in Figure 3A could be clustered to enhance interpretability, the blue bar in Figure 4A is unexplained, and Figure 4B is not discussed in the text. Additionally, the figure legend in Figure 7C lacks detail, and many figure descriptions merely repeat the captions without providing deeper insights.

    (5) Appraisal of the Study's Aims and Results:
    The authors have set out to achieve an ambitious goal of classifying HCC patients into distinct prognostic subtypes and validating these findings through both bulk and single-cell analyses. While the methodologies employed are innovative and the data integration comprehensive, the study falls short of fully achieving its aims due to inconsistencies in the narrative and incomplete validation. The results partially support the conclusions, but the lack of coherence and depth in certain areas limits the overall impact of the study.

    (6) Impact on the Field:
    If the identified weaknesses are addressed, this study has the potential to significantly impact the field of HCC research. The multi-omics approach combined with machine learning is a powerful framework that could set a new standard for cancer subtype classification. However, the current state of the manuscript leaves some uncertainty regarding the practical applicability of the findings, particularly in clinical settings.

    (7) Additional Context:
    For readers and researchers, this study offers a valuable look into the potential of integrating multi-omics data with machine learning to improve cancer classification and prognostication. However, readers should be aware of the noted weaknesses, particularly the need for more consistent narrative development and comprehensive validation of the methods. Addressing these issues could greatly enhance the study's utility and relevance to the community.

  3. Reviewer #2 (Public review):

    Summary:

    Overall, this is a well-executed and insightful study. With some refinement to the presentation and a deeper exploration of the implications, the manuscript will make a significant contribution to the field of cancer genomics and personalized medicine.

    Strengths:

    The manuscript integrates multi-omics data with machine learning to address the significant heterogeneity of hepatocellular carcinoma (HCC). The use of multiple clustering algorithms and a consensus method strengthens the robustness of the findings. The study successfully develops a prognostic model with excellent predictive accuracy, validated across independent datasets. This adds considerable value to the field, particularly in providing individualized treatment strategies. The identification of two distinct liver cancer subtypes with different biological and metabolic characteristics is well-supported by the data, offering a promising direction for personalized medicine.

    Weaknesses:

    (1) Consider streamlining the presentation of methods, especially regarding the clustering algorithms and machine learning models. Readers may find it difficult to follow the exact process unless more clearly outlined.

    (2) Some figures, such as the signaling pathways and heatmaps, are critical to understanding the study's findings. Ensure that all figures are high quality, easy to interpret, and adequately labeled. You may also want to highlight the key findings within the figure captions more explicitly.

    (3) While the manuscript does compare its prognostic model to those previously published, the novelty of the findings could be emphasized more clearly. Discussing the potential limitations of the study (e.g., the reliance on computational models and small sample sizes for scRNA-seq) could strengthen the manuscript.

    (4) The manuscript mentions that the data was split into training and validation datasets in a 1:1 ratio. How was the performance verified? Is there an independent test set?

    (5) The role of the MIF signaling pathway in subtype differentiation is intriguing, but further mechanistic insights into how this pathway drives the differences between CS1 and CS2 could be discussed in more detail. If experimental evidence for this pathway exists in the literature, it should be mentioned.

    (6) Some sentences are quite long and complex, which can affect readability. Breaking them down into shorter, clearer sentences would improve the flow.
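
The question in weakness (4) about an independent test set can be made concrete with a small sketch. It assumes a hypothetical workflow in which a test set is held out first and the remaining samples are then split 1:1 into training and validation sets. The 20% test fraction, the fixed seed, and the patient identifiers are illustrative assumptions, not details from the manuscript.

```python
import random


def split_cohort(sample_ids, test_fraction=0.2, seed=0):
    """Hold out a test set, then split the remainder 1:1 into train and validation."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded shuffle keeps the split reproducible
    n_test = round(len(ids) * test_fraction)
    test, rest = ids[:n_test], ids[n_test:]
    half = len(rest) // 2
    return rest[:half], rest[half:], test


# Hypothetical cohort of 100 patient identifiers.
patients = [f"P{i:03d}" for i in range(100)]
train, validation, test = split_cohort(patients)
```

Because the test samples are set aside before any model fitting or tuning, performance measured on them is not influenced by choices made on the training and validation halves.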

  4. Author response:

    Reviewer #1 (Recommendations for the authors):

    (1) Storyline and Narrative Flow:

    Consider revising the manuscript to create a more coherent and consistent narrative. Clarify how each section of the study, particularly the transition from multi-omics data integration to single-cell RNA-seq validation, contributes to the overall research question. This will help readers better understand the logical flow of the study.

    In the upcoming revisions, we will optimize the logical connections between sections of the manuscript to clarify the role each part plays in the overall research question, making it easier for readers to follow.

    (2) Immune Cell Activity Analysis:

    Reevaluate the methods used to assess immune cell activities within the context of the tumor microenvironment. Consider providing additional justification for the relevance of using the cancer cell model for this analysis. If necessary, explore alternative methods or models that might offer more meaningful insights into immune-tumor interactions.

    We fully recognize the importance of using tumor models to analyze and validate immune activity results, and we are considering experimental research in this area in future projects.

    (3) Single-Cell RNA-Seq Validation:

    Expand the validation of your findings using single-cell RNA-seq data. This could include more in-depth analyses that explore the heterogeneity within the subtypes and confirm the robustness of your classification method at the single-cell level. This would strengthen the support for your claims about the relevance of the identified subtypes.

    In the current study, we have applied the obtained multi-omics profiling features to single-cell sequencing data to classify malignant cells. We analyzed the metabolic and cell communication differences between different subtypes of malignant cells and explored potential reasons for these differences. Next, we plan to conduct further analysis of the differences between malignant cell subtypes to identify additional clues and mechanisms underlying these variations.

    (4) Methodological Justification:

    Provide a more detailed rationale for the selection of machine learning algorithms and integration strategies used in the study. Explain why the chosen methods are particularly well-suited for this research, and discuss any potential limitations they might have.

    In the revised manuscript, we will include descriptions of the principles of these analytical methods, as well as examples of their application in other studies, to discuss the rationale and limitations of applying these methods in this research.

    (5) Figures and Visualizations:

    Improve the clarity of your figures by addressing the following:

    a) Figure 3A: Cluster the pathways to make the comparisons clearer and more meaningful.

    b) Figure 4A: Clearly explain the significance of the blue bar.

    c) Figure 4B: Ensure this figure is discussed in the main text to justify its inclusion.

    d) Figure 7C: Enhance the figure legend to provide more informative details.

    Additionally, ensure that figure descriptions go beyond the captions and provide detailed explanations that help the reader understand the significance of each figure.

    We fully agree with the reviewer’s suggestions regarding these figures, and we will make the necessary revisions in the revised manuscript.

    (6) Supplementary Materials:

    Consider including more detailed supplementary materials that provide additional validation data, extended methodological descriptions, and any other information that would support the robustness of your findings.

    When we submit the revised manuscript, we will include supplementary materials, such as figures or tables, that make the manuscript more complete.

    (7) Recent Literature:

    Incorporate more recent studies in your discussion, especially those related to HCC subtypes and the application of machine learning in oncology. This will provide a more current context for your work and help position your findings within the broader field.

    We appreciate the reviewer's suggestion. We will incorporate more recent studies into the discussion section and optimize its content.

    (8) Data and Code Availability:

    Ensure that all data, code, and materials used in your study are made available in line with eLife's policies. Provide clear links to repositories where readers can access the data and code used in your analyses.

    We have indicated the sources of the data and tools used in the analysis process within the text, and these data and tools can be accessed through the websites or literature we have cited.

    Reviewer #2 (Recommendations for the authors):

    (1) While the computational findings are robust, further experimental validation of the two subtypes, particularly the role of the MIF signaling pathway, would strengthen the biological relevance of the findings. In vitro or in vivo validation could confirm the proposed mechanisms and their influence on patient prognosis.

    We fully recognize the importance of using tumor models to analyze and validate immune activity results, and we are considering experimental research in this area in future projects.

    (2) Consider testing the model on additional independent cohorts beyond the TCGA and ICGC datasets to further demonstrate its generalizability and applicability across different patient populations.

    We are considering looking for independent external datasets in the GEO database or other databases to validate our model.

    (3) Review the manuscript for long or complex sentences, which can be broken down into shorter, more readable parts.

    In the revised manuscript, we will address any grammatical issues present in the manuscript and modify long and complex sentences that may hinder reader comprehension.
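
When a prognostic model is tested on an independent cohort, as recommendation (2) from Reviewer #2 suggests, one commonly used summary is Harrell's concordance index. The text above does not state which metric the authors used, so the sketch below is only an illustration of that standard measure under assumed inputs: follow-up times, event indicators (1 for an observed event, 0 for censored), and model risk scores. The toy values are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs where the higher risk fails first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when sample i had an observed event
            # strictly before sample j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable


# Hypothetical external cohort: four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so comparing this value between the training data and an external cohort gives a rough sense of how well a model generalizes.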