Investigating the native functions of [NiFe]-carbon monoxide dehydrogenases through genomic context analysis

Curation statements for this article:
  • Curated by eLife

    eLife Assessment

    This study presents a valuable analysis of a large dataset of [NiFe]-CODHs, integrating genomic context, operon organization, and clade-specific gene neighborhoods to discern patterns of functional diversification and adaptation. Carefully looking at the CODH genomic context, e.g., CODH-HCP co-occurrence, the authors gain insight into enzymatic activity, biotechnological potential, and differential functional roles. The approach aligns with current standards in genomic enzymology to characterize newly identified enzymes. With solid support, this work provides a broadly informative contribution to the field.

Abstract

Carbon monoxide dehydrogenases containing nickel-iron active sites ([NiFe]-CODHs) catalyze the reversible oxidation of CO to CO2, representing key targets for biocatalytic CO2 reduction. Despite dramatic differences in catalytic rates and O2 tolerance between CODH variants, the molecular basis for this functional diversity remains poorly understood. We applied comparative genomics and synteny analysis to investigate the biochemical roles of CODH clades A-F using 1,376 CODH and 1,545 hybrid cluster protein sequences. Approximately 30% of genomes encode multiple CODH isoforms. Analysis revealed distinct gene clustering patterns correlating with biochemical function. Clades A, E, and F exhibit a degree of distributional exclusivity. Clades C and D frequently co-occur with active CODHs, suggesting auxiliary roles. Operon architecture analysis revealed functional specialization: clade A links to acetyl-CoA synthase; clades A, E, F contain essential maturation machinery (CooC, CooJ, CooT) correlating with catalytic activity; clade B associates with transporters; clade C with electron transfer partners; clade D with transcriptional regulators. High CODH-HCP co-occurrence (except clade A) suggests environmental interdependency. These findings establish clades A, E, F as primary biocatalyst targets while defining regulatory functions for clades C, D, providing a genomics framework for predicting CODH phenotypes.

Article activity feed

  1. eLife Assessment

    This study presents a valuable analysis of a large dataset of [NiFe]-CODHs, integrating genomic context, operon organization, and clade-specific gene neighborhoods to discern patterns of functional diversification and adaptation. Carefully looking at the CODH genomic context, e.g., CODH-HCP co-occurrence, the authors gain insight into enzymatic activity, biotechnological potential, and differential functional roles. The approach aligns with current standards in genomic enzymology to characterize newly identified enzymes. With solid support, this work provides a broadly informative contribution to the field.

  2. Reviewer #1 (Public review):

    Summary:

    This manuscript analyzes a large dataset of [NiFe]-CODHs with a focus on genomic context and operon organization. Beyond earlier phylogenetic and biochemical studies, it addresses CODH-HCP co-occurrence, clade-specific gene neighborhoods, and operon-level variation, offering new perspectives on functional diversification and adaptation.

    Strengths:

    The study has a valuable approach.

    Weaknesses:

    Several points should be addressed.

    (1) The rationale for excluding clades G and H should be clarified. Inoue et al. (Extremophiles 26:9, 2022) defined [NiFe]-CODH phylogenetic clades A-H. In the present manuscript, clades A-H are depicted, yet the analyses and discussion focus only on clades A-F. If clades G and H were deliberately excluded (e.g., due to limited sequence data or lack of biochemical evidence), the rationale should be clearly stated. Providing even a brief explanation of their status or the reason for omission would help readers understand the scope and limitations of the study. In addition, although Figure 1 shows clades A-H and cites Inoue et al. (2022), the manuscript does not explicitly state how these clades are defined. An explicit acknowledgement of the clade framework would improve clarity and ensure that readers fully understand the basis for subsequent analyses.

    (2) The co-occurrence data would benefit from clearer presentation in the supplementary material. At present, the supplementary data largely consist of raw values, making interpretation difficult. For example, in Figure 3b, the co-occurrence frequencies are hard to reconcile with the text: clade A shows no co-occurrence with clade B and even lower tendencies than clades E or F, while clade E appears relatively high. Similarly, the claim that clades C and D "more often co-occur, especially with A, E, and F" does not align with the numerical trends, where D and E show stronger co-occurrence but C does not. A concise, well-organized summary table would greatly improve clarity and prevent such misunderstandings.

    (3) The rationale for analyzing gene neighborhoods at the single-operon level needs clarification. Many microorganisms encode more than one CODH operon, yet the analysis was carried out at the level of individual operons. The authors should clarify the biological rationale for this choice and discuss how focusing on single operons rather than considering the full complement per organism might affect the interpretation of genomic context.
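
    As an illustration of the kind of summary table requested in point (2), the following is a minimal sketch, not the authors' method. It assumes a hypothetical mapping from genome identifiers to the CODH clades each genome encodes; the toy data, function names, and the choice to normalize by the row clade's genome count are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

CLADES = ["A", "B", "C", "D", "E", "F"]


def cooccurrence_counts(genome_clades):
    """Count genomes per clade and genomes per unordered clade pair."""
    clade_totals = Counter()
    pair_counts = Counter()
    for clades in genome_clades.values():
        # A genome with several isoforms of one clade is counted once.
        present = sorted(set(clades))
        clade_totals.update(present)
        pair_counts.update(combinations(present, 2))
    return clade_totals, pair_counts


def cooccurrence_fraction(clade_totals, pair_counts, a, b):
    """Fraction of genomes carrying clade `a` that also carry clade `b`."""
    if clade_totals[a] == 0:
        return 0.0
    pair = tuple(sorted((a, b)))
    return pair_counts[pair] / clade_totals[a]


def format_table(clade_totals, pair_counts, clades=CLADES):
    """Render a tab-separated table: one row per clade, one column per partner."""
    rows = ["clade\tn\t" + "\t".join(clades)]
    for a in clades:
        cells = [
            "-" if a == b
            else f"{cooccurrence_fraction(clade_totals, pair_counts, a, b):.2f}"
            for b in clades
        ]
        rows.append(f"{a}\t{clade_totals[a]}\t" + "\t".join(cells))
    return "\n".join(rows)


# Hypothetical toy input, not data from the manuscript.
toy = {
    "genome_1": ["A", "C"],
    "genome_2": ["A"],
    "genome_3": ["D", "E"],
    "genome_4": ["E", "D", "C"],
}
totals, pairs = cooccurrence_counts(toy)
print(format_table(totals, pairs))
```

    Normalizing each row by the number of genomes carrying that clade makes the table asymmetric, which is one way to separate statements such as "C co-occurs with A" from "A co-occurs with C"; reporting raw pair counts alongside would be an equally reasonable design.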

  3. Reviewer #2 (Public review):

    The authors present a comparative genomic and phylogenetic analysis aimed at elucidating the functions of nickel-dependent carbon monoxide dehydrogenases (Ni-CODHs) and hybrid-cluster proteins (HCPs). By examining gene neighborhoods, phylogenetic relationships, and co-occurrence patterns, they propose functional hypotheses for different CODH clades and highlight those with the greatest potential for biotechnological applications.

    A major strength of this work lies in its systematic and conceptually clear approach, which provides a rapid and low-cost framework for predicting the functional potential of newly identified CODHs based on sequence data and genomic context. The analysis is careful in minimizing false positives and offers valuable insights into the diversity and distribution of CODH enzyme clades.

    However, several limitations should be considered when interpreting the findings. The use of incomplete genome assemblies may lead to the exclusion of relevant genes or operonic regions. Clade H was omitted due to a lack of information on its host, and the number of class II HCPs included is limited. Although the genomic window analyzed is relatively broad, it may still miss functionally relevant neighboring genes. The study assumes that the pathways associated with CODHs are encoded near the enzyme loci, but these could also occur elsewhere in the genome or on the complementary strand. The authors acknowledge these and other limitations clearly and thoughtfully, which strengthens the transparency and credibility of their analysis.
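
    The window and strand caveats above can be made concrete with a small sketch. It is illustrative only and assumes a hypothetical gene record and hypothetical coordinates; the window size, gene list, and the `same_strand_only` switch are not taken from the manuscript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    contig: str
    start: int
    end: int
    strand: str  # "+" or "-"


def neighbors_in_window(genes, anchor, window_bp, same_strand_only=False):
    """Return genes on the anchor's contig that overlap the flanking window."""
    lo = anchor.start - window_bp
    hi = anchor.end + window_bp
    hits = []
    for gene in genes:
        if gene == anchor or gene.contig != anchor.contig:
            continue
        if same_strand_only and gene.strand != anchor.strand:
            continue
        if gene.end >= lo and gene.start <= hi:
            hits.append(gene)
    return sorted(hits, key=lambda g: g.start)


# Hypothetical toy annotation, not data from the manuscript.
genes = [
    Gene("cooC", "contig_1", 100, 900, "+"),
    Gene("cooS", "contig_1", 1000, 2900, "+"),
    Gene("regulator_x", "contig_1", 3100, 3800, "-"),
    Gene("hcp", "contig_1", 9000, 10600, "+"),
    Gene("transporter_y", "contig_2", 1200, 2400, "+"),
]
codh = genes[1]

both_strands = [g.name for g in neighbors_in_window(genes, codh, 1000)]
same_strand = [g.name for g in neighbors_in_window(genes, codh, 1000, same_strand_only=True)]
wide_window = [g.name for g in neighbors_in_window(genes, codh, 10000)]
print(both_strands, same_strand, wide_window)
```

    In this toy case a narrow window misses the distant `hcp` gene, a same-strand filter drops the divergently encoded regulator, and a gene on another contig is never seen, which mirrors the three ways a neighborhood analysis can overlook relevant partners.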

    Given the high evolutionary diversity of CODHs, both across and within clades, phenotypic predictions derived solely from sequence and neighborhood data should be interpreted with caution. Sequence-based searches, while specific, may have limited sensitivity, and structural homology searches could further enrich the dataset. Additionally, the visual inspection used to filter out non-CODH sequences is not described in detail, leaving uncertainty about reproducibility. The generalization of enzymatic activity or inactivity from a few characterized examples to entire clades should also be regarded as tentative.

    Despite these limitations, the study presents a solid and valuable methodological framework that can aid in the rapid functional screening of novel CODH enzymes and may inspire broader applications in enzyme discovery and metabolic annotation.