Gene interaction perturbation network deciphers a high-resolution taxonomy in colorectal cancer

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    Liu et al. describe an unsupervised method that clusters colorectal cancer samples based on perturbations to gene interactions. They show that this method strongly suggests 6 distinct clusters of samples and identify phenotypes associated with the clusters, including survival, drug response, immune phenotype, response to immune checkpoint inhibitors, and perturbed pathways. This is an interesting and significant manuscript, and the work has been well conducted.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

Abstract

Molecular subtypes of colorectal cancer (CRC) are currently identified via snapshot transcriptional profiles, largely ignoring the dynamic changes in gene expression. Conversely, biological networks remain relatively stable irrespective of time and condition. Here, we introduce an individual-specific gene interaction perturbation network-based (GIN) approach and identify six GIN subtypes (GINS1-6) with distinguishing features: (i) GINS1 (proliferative, 24%~34%), elevated proliferative activity, high tumor purity, immune-desert, PIK3CA mutations, and immunotherapeutic resistance; (ii) GINS2 (stromal-rich, 14%~22%), abundant fibroblasts, immune-suppressed, stem-cell-like, SMAD4 mutations, unfavorable prognosis, high potential of recurrence and metastasis, immunotherapeutic resistance, and sensitivity to fluorouracil-based chemotherapy; (iii) GINS3 (KRAS-inactivated, 13%~20%), high tumor purity, immune-desert, activation of EGFR and ephrin receptors, chromosomal instability (CIN), fewer KRAS mutations, SMOC1 methylation, immunotherapeutic resistance, and sensitivity to cetuximab and bevacizumab; (iv) GINS4 (mixed, 10%~19%), moderate levels of stromal and immune activities, transit-amplifying-like, and TMEM106A methylation; (v) GINS5 (immune-activated, 12%~24%), stronger immune activation, high tumor mutation and neoantigen burden, microsatellite instability and high CpG island methylator phenotype, BRAF mutations, favorable prognosis, and sensitivity to immunotherapy and PARP inhibitors; (vi) GINS6 (metabolic, 5%~8%), accumulated fatty acids, enterocyte-like, and BMP activity. Overall, the novel high-resolution taxonomy derived from an interactome perspective could facilitate more effective management of CRC patients.

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    Liu et al. applied an existing method to study the subtypes of CRC from a network perspective. In the proposed framework, the authors calculated the perturbation of expression-rank differences of predefined network edges in both tumor and normal samples. By clustering the derived perturbation scores in CRC tumors using publicly available gene expression datasets, they reported six subtypes (referred to as GINS1-6) and then focused on the association of each subtype with clinical features and known molecular mechanisms and cell phenotypes. My recommendation is major revision.

    Major concerns:

    (1) While this study originates from the network perspective, it is unclear to me if the new subtypes provide key novel insights into the gene regulatory mechanisms for the development of CRC. For example, the "Biological peculiarities of six subtypes" section is descriptive and lacks a punch point.

    Thank you for your professional suggestions. In this study, we focused on global network perturbations instead of snapshot transcriptional profiles, because snapshot profiles largely ignore the dynamic changes in gene expression in a biological system, whereas biological networks remain relatively stable irrespective of time and condition. We used the global network perturbation matrix only to perform consensus clustering, not to explore the gene regulatory mechanisms of each subtype. Nevertheless, subtype-related studies typically investigate the biological characteristics of each subtype.

    Thus, we delineated the biological attributes inherent to the GINS subtypes using two different algorithms (SSEA and GSVA). This work was done to understand the underlying biological characteristics of these subtypes and to define them biologically, similar to previous subtype studies (PMID: 31563503; 31875970; 26457759; 30833271; 30842092; 32164750; 30837276). As you commented, the section was descriptive and lacked a punch point. Hence, we highlighted the potential transformation among GINS2/4/5. In this study, GINS2 was endowed with higher stromal activity and lower immune activity, whereas GINS5 showed the opposite trend, concordant with the tumor invasiveness and prognosis of the two subtypes, and GINS4 was characterized by a mixed phenotype with moderate levels of stromal and immune pathway activity. As three subtypes with abundant tumor microenvironment (TME) components, GINS2/4/5 may mutually evolve in stromal and immune functions. We therefore extracted genes that were consistently upregulated or downregulated across these three subtypes using the Mfuzz package, a noise-robust soft clustering analysis based on fuzzy c-means (Kumar and Futschik, 2007). The Mfuzz analysis revealed 10 gene clusters, of which gene clusters 3 and 10 displayed a stable expression pattern from GINS2 to GINS5 (Figure 5C and Supplementary File 8). As expected, gene cluster 3 was predominantly associated with immune infiltration and activation (Figure 5D), whereas gene cluster 10 was prominently characterized by stromal activation and remodeling (Figure 5E), which further supported our findings. This also indicated that the TME had a profound impact on the progression and prognosis of tumors, and that GINS2 and GINS5, as two extremes of TME composition, indeed showed diametrically opposite clinical outcomes (marked in red in the “Biological peculiarities of six subtypes” section).

    We then further investigated the immune regulation of the GINS subtypes. We found that GINS5 was also characterized by higher immune infiltration and stronger immunogenicity based on the transcriptome and proteome analyses. For example, GINS5 harbored a remarkably higher tumor mutation burden (TMB) and neoantigen load (NAL) (P < 0.001, Figure 6C), possibly further inducing abundant immune elements and regulation. GINS5 also showed abundant infiltration of Th1 cells, Th2 cells, and M1 macrophages (Mills et al., 2016) (Figure 6-figure supplement 1A-C), which can secrete proinflammatory cytokines and enhance immune activation. Conversely, M2 macrophages, traditionally regarded as promoting tumor growth by suppressing cell-mediated immunity and subsequent cancer cell killing (Mills et al., 2016), were significantly elevated in GINS2 (Figure 6-figure supplement 1D). In line with this, three other classical immunosuppressive cell types, namely fibroblasts, myeloid-derived suppressor cells (MDSCs), and Treg cells (Hicks et al., 2022), were also significantly enriched in GINS2.
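
    To illustrate the soft clustering idea behind the step described above, the following is a minimal Python sketch of fuzzy c-means, the algorithm family that Mfuzz builds on. It is not the Mfuzz implementation (Mfuzz is an R package) and not the authors' code: the function name `fuzzy_c_means`, its parameters, and the toy expression profiles are assumptions made for illustration only.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Soft-cluster the rows of X into c clusters (hypothetical helper).

    X: array of shape (n_genes, n_conditions), e.g. standardized mean
       expression of each gene in GINS2, GINS4, and GINS5.
    m: fuzzifier (> 1); larger values give softer memberships.
    Returns (U, centers), where U[i, k] is the membership of gene i in
    cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers are membership-weighted means of the profiles.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every profile to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Standard fuzzy c-means membership update.
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

# Toy profiles across three ordered subtypes (made-up numbers):
# two genes rising and two genes falling from the first to the last column.
X = np.array([[-1.0, 0.0, 1.0],
              [-1.1, 0.1, 1.0],
              [1.0, 0.0, -1.0],
              [0.9, 0.0, -1.1]])
U, centers = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)  # hard assignment, if one is needed
```

    Unlike hard clustering, each gene keeps a graded membership in every cluster, so genes with noisy or ambiguous profiles can be filtered by a membership threshold rather than being forced into one group.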

    Additionally, in the “GINS6 tumors conveyed rich lipid metabolisms” section, we further observed that lipid metabolism pathways were the most significant metabolic processes in GINS6. Metabolomics analysis suggested that GINS6 exhibited higher levels of four fatty acids, including α-glycerophosphate, adipate, taurocholate, and aconitate. These findings supported that GINS6 was closely associated with metabolic reprogramming and accumulated fatty acids.

    Overall, it is difficult to investigate the underlying biological mechanisms of all subtypes in depth within a single paragraph, so we first used the ‘Hallmark’ gene sets to preliminarily explore the biological characteristics of these subtypes. This gave us inspiration and direction for further exploration; the subsequent analyses refine and deepen this part.

    Thank you for your academic discussion with us.

    (2) To further demonstrate the novelty of the identified subtypes, the authors need to show the additional benefit of the GINS1-6 to patient stratification derived from existing methods, such as integrative clustering based on multiple genomic evidence (copy number alterations, gene expression and somatic mutations).

    Thank you for your thoughtful comments. We would like to clarify this issue from the following three aspects:

    1. First, the idea that inspired us is that global network perturbations have advantages over snapshot transcriptional profiles (the main traditional approach in CRC), because snapshot transcriptional profiles largely ignore the dynamic changes in gene expression in a biological system, whereas biological networks remain relatively stable irrespective of time and condition. The gene interactions in a biological network are stable overall in a particular type of normal human tissue but widely perturbed in diseased tissues (PMID: 29040359 and 25165092). These perturbations in gene interactions (edge perturbations) in each sample can be measured by the change in relative gene expression values. Edge perturbations at the individual level can efficiently characterize the perturbation of the biological network for each sample. This is the starting point for cancer clustering in this study.

    2. Second, the essence of molecular clustering is to investigate tumor heterogeneity. To detect multiple subtypes (some of which may represent relatively small fractions of the patient population) (PMID: 23584089), clustering methods require moderately large numbers of samples, more than are contained in any one of the individual CRC datasets published to date. With that in mind, we began our analysis by identifying suitable and comparable microarray datasets (n=2167, Supplementary File 15). The sample size of our discovery dataset is the largest among current CRC subtype-related studies. For multi-omics clustering, there is currently no multi-omics sequencing cohort with a large number of samples and good sequencing quality; only the TCGA-CRC cohort has eligible multi-omics data (fewer than 300 patients with multi-omics data). Therefore, subtypes that represent relatively small fractions of the patient population cannot be detected by multi-omics clustering.

    3. Third, we tested several methods and datasets before determining the GINS subtypes. Clustering always divides tumors into several subgroups, but we expect these subgroups to be reproducible in other cohorts. Thus, we needed to validate the robustness of our subtypes in multiple independent cohorts. Our validation work focused on the following four contexts: (1) data from the same platform (GPL570); (2) data from different platforms and sequencing techniques (microarray or RNA-seq); (3) microdissected or whole tumors; (4) an in-house clinical setting. However, as mentioned above, only TCGA-CRC has eligible multi-omics data, so a rigorous verification of multi-omics clustering cannot be carried out.
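
    The edge-perturbation idea described in the first point can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes that genes are ranked by expression within each sample, that an edge's "delta rank" is the rank difference between its two genes, and that the perturbation of an edge in a tumor sample is its delta rank minus the mean delta rank of that edge across normal samples. The function names and the toy data are hypothetical.

```python
import numpy as np

def rank_genes(expr):
    """Rank genes within each sample (column); 1 = lowest expression.

    Ties are broken by row order, which is a simplification.
    """
    return expr.argsort(axis=0).argsort(axis=0) + 1

def edge_perturbation(tumor, normal, edges):
    """Return an (n_edges, n_tumor_samples) perturbation matrix.

    tumor, normal: arrays of shape (n_genes, n_samples), same gene order.
    edges: integer array of shape (n_edges, 2) holding gene row indices
           for the two genes of each predefined network edge.
    """
    i, j = edges[:, 0], edges[:, 1]
    tumor_rank = rank_genes(tumor)
    normal_rank = rank_genes(normal)
    # Delta rank of each edge in each sample: rank(gene i) - rank(gene j).
    tumor_delta = tumor_rank[i] - tumor_rank[j]
    normal_delta = normal_rank[i] - normal_rank[j]
    # Benchmark: mean delta rank of each edge across the normal samples.
    benchmark = normal_delta.mean(axis=1, keepdims=True)
    # Perturbation: how far each tumor sample departs from the benchmark.
    return tumor_delta - benchmark

# Toy data (made-up numbers): 3 genes, 2 normal and 2 tumor samples.
normal = np.array([[1.0, 1.5], [2.0, 2.5], [3.0, 3.5]])
tumor = np.array([[3.0, 1.0], [2.0, 2.0], [1.0, 3.0]])
edges = np.array([[0, 1], [1, 2]])
perturbation = edge_perturbation(tumor, normal, edges)
# Tumor sample 0 reverses the gene order seen in normal tissue, so both
# edges are perturbed; tumor sample 1 matches the normal benchmark.
```

    The resulting edge-by-sample matrix is the kind of input that a clustering step could then operate on; because it is built from within-sample ranks, it depends on relative rather than absolute expression values.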

    Thank you for your academic discussion with us.

  2. Evaluation Summary:

    Liu et al. describe an unsupervised method that clusters colorectal cancer samples based on perturbations to gene interactions. They show that this method strongly suggests 6 distinct clusters of samples and identify phenotypes associated with the clusters, including survival, drug response, immune phenotype, response to immune checkpoint inhibitors, and perturbed pathways. This is an interesting and significant manuscript, and the work has been well conducted.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

  3. Reviewer #1 (Public Review):

    Liu et al. applied an existing method to study the subtypes of CRC from a network perspective. In the proposed framework, the authors calculated the perturbation of expression-rank differences of predefined network edges in both tumor and normal samples. By clustering the derived perturbation scores in CRC tumors using publicly available gene expression datasets, they reported six subtypes (referred to as GINS1-6) and then focused on the association of each subtype with clinical features and known molecular mechanisms and cell phenotypes. My recommendation is major revision.

    Major concerns:

    (1) While this study originates from the network perspective, it is unclear to me if the new subtypes provide key novel insights into the gene regulatory mechanisms for the development of CRC. For example, the "Biological peculiarities of six subtypes" section is descriptive and lacks a punch point.

    (2) To further demonstrate the novelty of the identified subtypes, the authors need to show the additional benefit of the GINS1-6 to patient stratification derived from existing methods, such as integrative clustering based on multiple genomic evidence (copy number alterations, gene expression and somatic mutations).

  4. Reviewer #2 (Public Review):

    Liu et al. describe an unsupervised method that clusters colorectal cancer samples based on perturbations to gene interactions. They show that this method strongly suggests 6 distinct clusters of samples and identify phenotypes associated with the clusters, including survival, drug response, immune phenotype, response to immune checkpoint inhibitors, and perturbed pathways. The validation work appears thorough and comprehensive, covering four contexts: (1) data from the same platform (GPL570); (2) data from different platforms and sequencing techniques (microarray or RNA-seq); (3) microdissected or whole tumors; (4) an in-house clinical setting. The attributes of the six subtypes (hallmarks, immune regulation, cellular phenotypes, etc.) were explored and described comprehensively, and were analyzed according to their inherent characteristics with respect to suitability for clinical treatment. For instance, GINS2 can benefit more from ACT, while GINS6 is more enriched in lipid metabolism and thus might be more sensitive to metabolic inhibitors. This provides not only pathological insights into CRC but also promising clinical value.

    Overall, this is a well-written manuscript that presents interesting results. The authors' magnificent work is highly commendable and shows a wealth of interesting approaches and results.

  5. Reviewer #3 (Public Review):

    The authors have constructed a large-scale interaction perturbation network from 2,167 CRC tissues and 308 normal tissues, deciphering six GINS subtypes with particular clinical and molecular peculiarities. In addition, the GINS taxonomy was rigorously validated in 19 external datasets (n = 3,420) with distinct conditions. From an interactome perspective, this study identified and extensively validated a high-resolution classification system, which could confidently serve as an ideal tool for optimizing decision-making for CRC patients. The diverse biological and clinical peculiarities of the GINS taxonomy improve the understanding of CRC heterogeneity and facilitate clinical stratification and individualized management. Additionally, candidate subtype-specific agents provide more targeted or combined interventions for the six subtypes, which still need to be validated in clinical settings.