Structural evolution of nitrogenase enzymes over geologic time
Curation statements for this article:-
Curated by eLife
eLife Assessment
This valuable study presents computational analyses of over 5,000 predicted extant and ancestral nitrogenase structures. The data analyses are convincing and offer unique insights into the relationship between structural evolution and environmental and biological phenotypes. The data generated in this study provide a vast resource that can serve as a starting point for studies of reconstructed and extant nitrogenases.
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (eLife)
Abstract
Life on Earth is about 4 billion years old, nearly as old as the planet itself. Over this immense timespan, living systems and their biomolecules have both adapted to and driven profound changes in the Earth’s environment. Among these, certain critical enzymes emerged early and have persisted through planetary extremes. Here, we implement an integrated approach to investigate the structural evolution of nitrogenase, an ancient and globally essential enzyme responsible for biological nitrogen fixation. Despite the ecological diversity of its host microbes, nitrogenase retains strict functional constraints, including extreme oxygen sensitivity, high energy demands, and substrate availability. We combined phylogenetics, ancestral sequence reconstruction, protein crystallography and deep-learning-based structural prediction to resurrect nearly three billion years of nitrogenase structural history. This work represents the first effort to predict the full set of extant and ancestral structures along the evolutionary tree of a single enzyme, yielding over 5,000 structural models. Our framework lays a foundation for reconstructing key structural constraints that shape protein evolution and examining ancient enzymes within the broader context of phylogenetic relationships and environmental transitions across geological timescales.
Article activity feed
-
eLife Assessment
This valuable study presents computational analyses of over 5,000 predicted extant and ancestral nitrogenase structures. The data analyses are convincing and offer unique insights into the relationship between structural evolution and environmental and biological phenotypes. The data generated in this study provide a vast resource that can serve as a starting point for studies of reconstructed and extant nitrogenases.
-
Reviewer #1 (Public review):
This was a clearly written manuscript that did an excellent job summarizing complex data. In this manuscript, Cuevas-Zuviría et al. use protein modeling to generate over 5,000 predicted structures of nitrogenase components, encompassing both extant and ancestral forms across different clades. The study highlights that key insertions define the various Nif groups. The authors also examined the structures of three ancestral nitrogenase variants that had been previously identified and experimentally tested. These ancestral forms were shown in earlier studies to exhibit reduced activity in Azotobacter vinelandii, a model diazotroph.
-
Reviewer #2 (Public review):
Summary:
This work aims to study the evolution of nitrogenases, understanding how their structure and function adapted to changes in environment, including oxygen levels and changes in metal availability.
The study predicts > 3000 structures of nitrogenases, corresponding to extant, ancestral and alternative ancestral sequences. It is observed that structural variations in the nitrogenases correlate with phylogenetic relationships. The amount of data generated in this study represents a massive and admirable undertaking. The study also provides strong insight into how structural evolution correlates with environmental and biological phenotypes.
-
Author response:
The following is the authors’ response to the previous reviews
Reviewer #1 (Public review):
Comments on revisions:
I appreciate the authors responding to my comments. I think Fig. S10 helps put the structural data into more context. It would be helpful to make clearer in the legend what proteins are being compared, especially in 10C.
Although I can see why the authors focus on the NifK extension and its potential connection to oxygen protection, I would point out that Vnf and Anf do not have this extension in their K subunit, and you find both Vnf and Anf in aerobic and facultative anaerobic diazotrophs. This is a minor point, but I think it is important to mention in the discussion.
We thank the reviewer for their thoughtful comments. We have now added a line to the Discussion following their recommendation and moved Figure S10 to the main text.
Reviewer #2 (Public review):
Summary:
This work aims to study the evolution of nitrogenases, understanding how their structure and function adapted to changes in environment, including oxygen levels and changes in metal availability.
The study predicts > 3000 structures of nitrogenases, corresponding to extant, ancestral and alternative ancestral sequences. It is observed that structural variations in the nitrogenases correlate with phylogenetic relationships. The amount of data generated in this study represents a massive and admirable undertaking. The study also provides strong insight into how structural evolution correlates with environmental and biological phenotypes.
We thank the reviewer for their summary and positive appraisal.
-
Reviewer #1 (Public review):
This was a clearly written manuscript that did an excellent job summarizing complex data. In this manuscript, Cuevas-Zuviría et al. use protein modeling to generate over 5,000 predicted structures of nitrogenase components, encompassing both extant and ancestral forms across different clades. The study highlights that key insertions define the various Nif groups. The authors also examined the structures of three ancestral nitrogenase variants that had been previously identified and experimentally tested. These ancestral forms were shown in earlier studies to exhibit reduced activity in Azotobacter vinelandii, a model diazotroph.
This work provides a useful resource for studying nitrogenase evolution. However, its impact is somewhat limited due to a lack of evidence linking the observed structural differences to functional changes. For example, in the ancestral nitrogenase structures, only a small set of residues (lines 421-431) were identified as potentially affecting interactions between nitrogenase components. Why didn't the authors test whether reverting these residues to their extant counterparts could improve nitrogenase activity of the ancestral variants?
Additionally, the paper feels somewhat disconnected. The predicted nitrogenase structures discussed in the first half of the manuscript were not well integrated with the findings from the ancestral structures. For instance, do the ancestral nitrogenase structures align with the predicted models? This comparison was never explicitly made and could have strengthened the study's conclusions.
Comments on revisions:
I appreciate the authors responding to my comments. I think Fig. S10 helps put the structural data into more context. It would be helpful to make clearer in the legend what proteins are being compared, especially in 10C.
Although I can see why the authors focus on the NifK extension and its potential connection to oxygen protection, I would point out that Vnf and Anf do not have this extension in their K subunit, and you find both Vnf and Anf in aerobic and facultative anaerobic diazotrophs. This is a minor point, but I think it is important to mention in the discussion.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Recommendations for the authors):
Line 122: There were a number of qualitative descriptors in the paper. For instance, if the authors want to say massive campaign, how massive? How rapid? These are relative terms in this context.
We have revised the text to minimize qualitative descriptors and to provide concrete numbers where possible. The revised sentence (line 121) now reads “We began our structural investigation of nitrogenase evolutionary history by conducting a large-scale structure prediction analysis of 5378 protein structures, a more than threefold increase compared to available nitrogenase structures in the PDB. We then analyzed our phylogenetic dataset to identify notable structural changes.”
Line 179: "massively scale up" How massive?
We agree with the reviewer’s observation. In response, we have removed the phrase “massively scale up” and revised the text.
Line 182: "no compromise on alignment depth and negligible cost to prediction accuracy". How do you know this? Is this shown somewhere? Was there a comparison between known structures and the predicted structure for those nitrogenases that have structures?
In response to this comment, we have made several clarifications and revisions in the manuscript:
We modified Figure S1, which now shows the pLDDT values (the per-residue confidence metric from AlphaFold) for all our predictions. These scores are consistently high (over 90 for the D and K subunits, and approximately 90 for the H subunits) regardless of whether the recycling protocol or the bona fide protocol was used.
The reviewer’s comment showed us that Figure S1 needed to represent these values more clearly, so we updated it accordingly.
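As a rough illustration of how a per-model pLDDT summary like the one described for Figure S1 could be obtained, the sketch below averages the B-factor field of the C-alpha atoms in a PDB-format model, which is where AlphaFold PDB output stores pLDDT. The function names, the `atom_line` helper, and the demo records are hypothetical; the response does not describe the authors' actual scripts.

```python
def mean_plddt(pdb_text: str) -> float:
    """Average pLDDT over residues, read from the B-factor field of CA atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns (1-based): atom name 13-16, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no CA atoms found")
    return sum(scores) / len(scores)


def atom_line(serial, name, res, chain, resseq, x, y, z, bfactor):
    """Format one fixed-width ATOM record (hypothetical helper for the demo)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{bfactor:6.2f}")


# Made-up records: two residues, with pLDDT values of 95 and 90.
demo = "\n".join([
    atom_line(1, "N", "MET", "A", 1, 0.0, 0.0, 0.0, 95.0),
    atom_line(2, "CA", "MET", "A", 1, 1.5, 0.0, 0.0, 95.0),
    atom_line(3, "CA", "LYS", "A", 2, 5.3, 0.0, 0.0, 90.0),
])
print(mean_plddt(demo))
```

Counting only CA atoms gives one value per residue, so long side chains do not weight the average.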
To prevent any misinterpretation of our claims about the accuracy and cost of the method, we have revised the text at line 179 as follows:
“In total, 2,689 unique extant and ancestral nitrogenase variants were targeted. All structures were generated in approximately 805 hours, including GPU computations and MMseqs2 alignments performed using two different protocols: one for extant or most likely ancestral sequences, and another for ancestral variants.”
To support our analyses further, Figure S10A compares our model predictions with available PDB structures for nitrogenases.
Additionally, Figure S10B compares our predicted structures with the experimental structures reported in this article. In all cases, we observe low RMSD values.
Line 220: "fall within 2 angstroms" instead of "fall 2A"?
We have updated it in the text.
Line 315: It is not clear how the binding affinities and other measurements in Figure 4 and S6C were measured, and it is not discussed in the material and methods.
We thank the reviewer for pointing out this lack of clarity. The binding affinity estimations were performed using Prodigy. We have updated the main text (see line 322) to explicitly state that binding affinities were estimated using Prodigy. In addition, we have expanded the Materials and Methods section to include additional information about the structure characterization methods (lines 745-749). Previously, these details were only noted in Supplementary Table S6.
Line 510-511: "Subtle, modular structural adjustments away from the active site were key to the evolution and persistence of nitrogenases over geologic time". This seems like a bit of an overstatement. While the authors see structural differences in the ancestral nitrogenase and speculate these differences could be involved in oxygen protection, there is no evidence that the ancestral nitrogenase is more sensitive to oxygen than the extant nitrogenase.
We appreciate the reviewer’s comment. Our intention was to emphasize that subtle, modular structural adjustments might have contributed to oxygen protection rather than to assert that ancestral nitrogenases are more oxygen-sensitive than their extant counterparts. We have revised the text to clarify.
Reviewer #2 (Recommendations for the authors):
What is the reference for the measured RMSDs in Fig 2A? What is the value on the y-axis? The range of 'Count' is unclear, given that there are 5000 structures predicted in the study.
Figure 2A presents a histogram of RMSD values from all pairwise alignments among 769 structures (385 extant and 384 ancestral DDKK), totaling 591,361 comparisons. We excluded ancestral DDKK variants due to computational limitations.
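The figure of 591,361 comparisons equals 769 × 769, which suggests that every ordered pair was counted, including each structure against itself; this reading is inferred from the arithmetic and is not stated in the response. The sketch below checks that count and shows a plain RMSD over matched coordinates. It assumes the two coordinate sets are already superposed and matched atom for atom, since the response does not say which alignment tool was used; all names are illustrative.

```python
import math


def ordered_pair_count(n: int) -> int:
    """All ordered pairs, self-comparisons included (n x n matrix)."""
    return n * n


def unordered_pair_count(n: int) -> int:
    """Distinct pairs only, without self-comparisons."""
    return n * (n - 1) // 2


def rmsd(coords_a, coords_b) -> float:
    """Root-mean-square deviation between two matched, pre-superposed
    lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


n_structures = 385 + 384  # extant + ancestral, as quoted in the response
print(ordered_pair_count(n_structures))    # 591361
print(unordered_pair_count(n_structures))  # 295296
```

If only distinct pairs were of interest, the histogram would hold 295,296 values; the n × n count also contains 769 zero-valued self-comparisons.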
Similarly, what is the sequence identity in Figure 2B calculated relative to?
In Figure 2B, sequence identities are derived from pairwise comparisons across all structures in our dataset. Each value represents the identity between two specific structures, rather than being measured against a single reference.
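A minimal sketch of pairwise identity, as opposed to identity against a single reference, is given below. It assumes the sequences come from a shared multiple alignment and that identity is counted over columns where neither sequence has a gap; the denominator the authors used is not stated, so that choice and the function names are assumptions.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(a == b for a, b in aligned)
    return 100.0 * matches / len(aligned)


def identity_matrix(aligned_seqs):
    """Identity of every sequence against every other (symmetric matrix)."""
    return [[percent_identity(a, b) for b in aligned_seqs] for a in aligned_seqs]


# Toy alignment, not real nitrogenase sequences.
toy = ["MK-LV", "MKALI", "MKALV"]
for row in identity_matrix(toy):
    print(row)
```

Each matrix entry can then be paired with the RMSD of the same two structures to produce a scatter like the one described for Figure 2B.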
The claim that 'structural analysis could reproduce sequence-based phylogenetic variation' should probably be tempered or qualified, given that the RMSD differences calculated are so low.
We hope to have addressed the concerns about the low RMSD values in the previous comments. We have revised the text (line 204), which now reads: “it still strongly correlates with sequence identity (Figure 2B), indicating that even minor structural variations can recapitulate sequence-based phylogenetic distinctions.”
How are binding affinities (Figure 4) calculated?
We have now clarified the binding affinity calculations in the main text. The model used is now detailed at line 322, with additional information provided in the Methods section.
Presumably, crystallized proteins (Anc1A, Anc1B, Anc2) were also among those whose structures were predicted with AF. A comparison should be provided of the predicted and crystallized structures, as this is an excellent opportunity to further comment on the reliability of AlphaFold.
In the revised manuscript, Figure S10 now presents structural comparisons between the crystallized proteins and their AlphaFold-predicted counterparts.
The labels in Figure 5B are not clear. Are the 3rd and 4th panels also comparative RMSD values? But only one complex name is provided.
We appreciate this feedback and have now revised Figure 5B for clarity.
Page 9 line 220, missing word: 'variants fall within/under 2 angstroms'
We thank the reviewer for the correction; we have updated the text.
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
This was a clearly written manuscript that did an excellent job summarizing complex data.
In this manuscript, Cuevas-Zuviría et al. use protein modeling to generate over 5,000 predicted structures of nitrogenase components, encompassing both extant and ancestral forms across different clades. The study highlights that key insertions define the various Nif groups. The authors also examined the structures of three ancestral nitrogenase variants that had been previously identified and experimentally tested. These ancestral forms were shown in earlier studies to exhibit reduced activity in Azotobacter vinelandii, a model diazotroph. This work provides a useful resource for studying nitrogenase evolution.
However, its impact is somewhat limited due to a lack of evidence linking the observed structural differences to functional changes. For example, in the ancestral nitrogenase structures, only a small set of residues (lines 421-431) were identified as potentially affecting interactions between nitrogenase components. Why didn't the authors test whether reverting these residues to their extant counterparts could improve nitrogenase activity of the ancestral variants?
We thank the reviewer for their thoughtful comments. We acknowledge that our current study is primarily focused on a computational exploration of the structural differences in both extant and ancestral nitrogenase variants, which allowed us to generate a comprehensive structural dataset. Although we did not carry out experimental reversion tests in this study, we agree that directly assessing the functional consequences of reverting the specific residues (lines 420 to 429) to their extant counterparts is an important next step to elucidate their functional role. Indeed, these findings provide a valuable foundation for our future work, which is designed to include experimental characterization of these variants and further elucidate the role of critical residues in nitrogenase activity and evolution. We believe that these experiments will offer the direct functional validation that the reviewer has rightly pointed out, and we look forward to reporting on these results in a future study.
Additionally, the paper feels somewhat disconnected. The predicted nitrogenase structures discussed in the first half of the manuscript were not well integrated with the findings from the ancestral structures. For instance, do the ancestral nitrogenase structures align with the predicted models? This comparison was never explicitly made and could have strengthened the study's conclusions.
We thank the reviewer for this suggestion. Our original analysis (previously shown in Figure S9, now Figure S10) included insights into structural alignment comparisons. In response, we have reorganized the results section (lines 351-355) to explicitly address this comparison.
Reviewer #2 (Public review):
This work aims to study the evolution of nitrogenases, understanding how their structure and function adapted to changes in the environment, including oxygen levels and changes in metal availability. The study predicts > 5000 structures of nitrogenases, corresponding to extant, ancestral, and alternative ancestral sequences. It is observed that structural variations in the nitrogenases correlate with phylogenetic relationships. The amount of data generated in this study represents a massive undertaking that is certain to be a resource for the community. The study also provides strong insight into how structural evolution correlates with environmental and biological phenotypes.
The challenge with this study is that all (or nearly all) of the quantitative analyses presented are based on RMSD calculations, many of which are under 2 angstroms. For all intents and purposes, two structures with RMSD < 2 angstroms could be considered 'structurally identical'. A lot of insight generated is based on minuscule differences in RMSD, for which it is not clear that they are significantly different. The suggestion would be to find a way to evaluate the RMSD metric and determine whether these values, as obtained for structures being compared, are reliable. Some options are provided in earlier studies: PMID: 11514933, PMID: 17218333, PMID: 11420449, PMID: 8289285 (and others). It could also be valuable to focus more on site-specific RMSDs rather than Global RMSDs. The high conservation in the nitrogenases likely ensures that the global RMSDs will remain low across the family. Focusing on specific regions might reveal interesting differences between clades that are more informative regarding the evolution of structure in tandem with environment/time.
We thank the reviewer for their suggestions. We agree that while global RMSD values below 2 Å typically indicate high structural similarity, relying solely on these measures can mask subtle yet potentially functionally meaningful differences. Our aim was not to test for overall structural identity but rather to quantify fine-scale variations between highly conserved nitrogenase structures, including extant and ancestral variants. Nevertheless, in light of the reviewer’s suggestions, we have implemented an additional metric (rmsd100) for a more nuanced comparison. The results of our additional analyses (Figure S3) align closely with our original results (Figure 2), supporting our decision to retain the un-normalized results in the main text. As an additional measure, we also computed site-specific RMSDs for the active site’s environments (Figure S6) to further delineate subtle structural variations.
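The response does not define rmsd100. The sketch below assumes it refers to the length-normalized RMSD of Carugo and Pongor, RMSD100 = RMSD / (1 + ln(sqrt(N / 100))), where N is the number of aligned residues; the function name and the example numbers are illustrative only.

```python
import math


def rmsd100(rmsd: float, n_aligned: int) -> float:
    """Length-normalized RMSD: the value expected if the two structures
    had 100 aligned residues (assumed Carugo-Pongor form)."""
    denominator = 1.0 + math.log(math.sqrt(n_aligned / 100.0))
    if denominator <= 0.0:
        # The normalization breaks down for very short alignments (N below ~14).
        raise ValueError("alignment too short for this normalization")
    return rmsd / denominator


# Illustrative numbers, not taken from the study.
print(rmsd100(1.5, 100))  # unchanged at N = 100
print(rmsd100(2.0, 400))  # scaled down for a longer alignment
```

Because the correction grows only logarithmically with length, structures of similar size keep nearly the same ranking, which would be in line with the reported agreement between Figure S3 and Figure 2.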
-
eLife Assessment
This useful study presents computational analyses of over 5,000 predicted extant and ancestral nitrogenase structures. While the data and some analyses are solid, the study remains incomplete in demonstrating that the metrics used for comparing nitrogenase structures are statistically rigorous. The data generated in this study provide a vast resource that can serve as a starting point for functional studies of reconstructed and extant nitrogenases.
-
Reviewer #1 (Public review):
This was a clearly written manuscript that did an excellent job summarizing complex data. In this manuscript, Cuevas-Zuviría et al. use protein modeling to generate over 5,000 predicted structures of nitrogenase components, encompassing both extant and ancestral forms across different clades. The study highlights that key insertions define the various Nif groups. The authors also examined the structures of three ancestral nitrogenase variants that had been previously identified and experimentally tested. These ancestral forms were shown in earlier studies to exhibit reduced activity in Azotobacter vinelandii, a model diazotroph.
This work provides a useful resource for studying nitrogenase evolution. However, its impact is somewhat limited due to a lack of evidence linking the observed structural differences to functional changes. For example, in the ancestral nitrogenase structures, only a small set of residues (lines 421-431) were identified as potentially affecting interactions between nitrogenase components. Why didn't the authors test whether reverting these residues to their extant counterparts could improve nitrogenase activity of the ancestral variants?
Additionally, the paper feels somewhat disconnected. The predicted nitrogenase structures discussed in the first half of the manuscript were not well integrated with the findings from the ancestral structures. For instance, do the ancestral nitrogenase structures align with the predicted models? This comparison was never explicitly made and could have strengthened the study's conclusions.
-
Reviewer #2 (Public review):
This work aims to study the evolution of nitrogenases, understanding how their structure and function adapted to changes in the environment, including oxygen levels and changes in metal availability.
The study predicts > 5000 structures of nitrogenases, corresponding to extant, ancestral, and alternative ancestral sequences. It is observed that structural variations in the nitrogenases correlate with phylogenetic relationships. The amount of data generated in this study represents a massive undertaking that is certain to be a resource for the community. The study also provides strong insight into how structural evolution correlates with environmental and biological phenotypes.
The challenge with this study is that all (or nearly all) of the quantitative analyses presented are based on RMSD calculations, many of which are under 2 angstroms. For all intents and purposes, two structures with RMSD < 2 angstroms could be considered 'structurally identical'. A lot of insight generated is based on minuscule differences in RMSD, for which it is not clear that they are significantly different. The suggestion would be to find a way to evaluate the RMSD metric and determine whether these values, as obtained for structures being compared, are reliable. Some options are provided in earlier studies: PMID: 11514933, PMID: 17218333, PMID: 11420449, PMID: 8289285 (and others).
It could also be valuable to focus more on site-specific RMSDs rather than Global RMSDs. The high conservation in the nitrogenases likely ensures that the global RMSDs will remain low across the family. Focusing on specific regions might reveal interesting differences between clades that are more informative regarding the evolution of structure in tandem with environment/time.
-
-