Ancestral reconstruction of duplicated signaling proteins reveals the evolution of signaling specificity

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    This manuscript is of interest to protein biochemists, protein engineers, and those interested in molecular evolution. The computation and experiments presented in this paper are very logical and rigorously performed. The results provide an example of how protein interaction specificity can be rewired using a small number of mutations, in the context of ancestral sequence reconstruction.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

Abstract

Gene duplication is crucial to generating novel signaling pathways during evolution. However, it remains unclear how the redundant proteins produced by gene duplication ultimately acquire new interaction specificities to establish insulated paralogous signaling pathways. Here, we used ancestral sequence reconstruction to resurrect and characterize a bacterial two-component signaling system that duplicated in α-proteobacteria. We determined the interaction specificities of the signaling proteins that existed before and immediately after this duplication event and then identified key mutations responsible for establishing specificity in the two systems. Just three mutations, in only two of the four interacting proteins, were sufficient to establish specificity of the extant systems. Some of these mutations weakened interactions between paralogous systems to limit crosstalk. However, others strengthened interactions within a system, indicating that the ancestral interaction, although functional, had the potential to be strengthened. Our work suggests that protein-protein interactions with such latent potential may be highly amenable to duplication and divergence.

Article activity feed

  1. Author Response

    Reviewer #3 (Public review):

    Weaknesses:

    1. The authors reconstruct a single phylogenetic tree for both the HK and RR components, concatenating the sequences together and then performing a single analysis. This could be problematic. First, if horizontal gene transfer occurred for one partner but not the other, the gene trees for the HK and RR components could be discordant. In this scenario, the reconstructed sequences would be incorrect because they were inferred on a tree assumed a priori to be concordant. Second, there was insufficient detail in the methods to know how the matched pairs of HK/RR sequences were generated. If the authors inadvertently mixed up paralogs (e.g. generating incorrect HK1-RR2 or HK2-RR1 concatenations) this could lead to a poor phylogenetic inference. A simple way to check for both problems would be to generate phylogenetic trees for HK and RR separately and check for tree concordance. If the separate trees are concordant, the concatenated sequences are justified. If the separate trees are discordant, the authors would have to determine whether independent reconstructions would alter their reconstructed sequences.

    Discussed in Essential Revisions, above. In addition, a better description was added to the methods section to specify that only adjacent HK and RR sequences were matched, and any ambiguous clusters of two component systems were removed from the analysis to avoid this type of artifact.
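
    The concordance check the reviewer proposes can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes the HK and RR trees are already available as nested tuples of species names (hypothetical labels below) and compares their topologies with the Robinson-Foulds distance, where 0 means the two unrooted trees contain the same splits.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions (splits) of an unrooted tree.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always gets the same label
    regardless of where the tree happens to be rooted.
    """
    all_leaves = leaves(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_leaves - clade if reference in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical toy topologies, not taken from the manuscript.
hk_tree = ((("sp1", "sp2"), "sp3"), ("sp4", "sp5"))
rr_concordant = (("sp4", "sp5"), ("sp3", ("sp2", "sp1")))
rr_discordant = ((("sp1", "sp3"), "sp2"), ("sp4", "sp5"))

print(robinson_foulds(hk_tree, rr_concordant))
print(robinson_foulds(hk_tree, rr_discordant))
```

    A distance of zero would support concatenating HK and RR; a non-zero distance would flag the branches on which the two gene trees disagree. Real trees also carry branch lengths and support values, which this topology-only sketch ignores.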

    2. The authors use a simple in vitro phosphorylation assay as their assay for the ability of HK to phosphorylate RR. There were, however, two aspects of the assay that were not clear in the text.

    2A) First, the authors built their quantification around tracking the depletion of phosphorylated HK. There were a number of variants that showed much slower HK dephosphorylation than others, with barely detectable RR phosphorylation. A sceptical reviewer might wonder if this slow activity represents specific dephosphorylation or instead spontaneous dephosphorylation to inorganic phosphate. (If the latter, the reconstructed protein is not really functional at all.) An appropriate negative control would be tracking the rate of dephosphorylation of HK with no RR added.

    This is a reasonable concern, and a new figure has been added (Figure 1 – Figure Supplement 2) to show that each HK is stably phosphorylated over the 30-minute time course when no RR is added. A sentence has also been added to the results to reference this control (last paragraph of “EnvZ/OmpR has undergone duplication and diversification in alpha-proteobacteria”).

    2B) Second, the authors used this assay to compare relative catalytic efficiencies (kcat/KM) of their variants. It was unclear how they extracted this information from the data as presented, which consist of a single velocity curve determined at a fixed concentration of HK and RR. In most contexts, obtaining kcat/KM requires measuring V0 vs. S0. More information on what precisely is being reported is necessary. (I should note that their qualitative results, looking at the gels, won't be affected by this; just statements like a 28-fold preference of ancHK2 for ancRR2 vs. ancRR1).

    We have added a better explanation of how these relative catalytic efficiencies were calculated both to the results section (penultimate paragraph of “Ancestral protein reconstruction reveals early acquisition of paralog specificity”) and to the methods section.
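
    One common way to obtain a relative catalytic efficiency from a single time course is sketched below. The feed does not describe the authors' actual calculation, so this is an illustration under stated assumptions only: RR is in excess over HK~P and well below KM, so HK~P decays as a single exponential whose observed rate constant is proportional to kcat/KM, and the ratio of two observed rates measured at the same RR concentration gives the fold preference. The function names and all numbers are hypothetical, and the data are synthetic.

```python
import math


def observed_rate(times_min, fraction_hk_p):
    """Estimate k_obs (per minute) from the fraction of HK~P remaining.

    Fits ln(fraction) against time by ordinary least squares and
    returns the negated slope, assuming single-exponential decay.
    """
    logs = [math.log(f) for f in fraction_hk_p]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return -sxy / sxx


def fold_preference(times_min, cognate, non_cognate):
    """Ratio of observed rates, i.e. relative kcat/KM at equal [RR]."""
    return observed_rate(times_min, cognate) / observed_rate(times_min, non_cognate)


# Synthetic time courses (not data from the manuscript):
# fast decay with the cognate RR, ten-fold slower with the non-cognate RR.
times = [0.0, 1.0, 2.0, 5.0, 10.0]
cognate = [math.exp(-0.5 * t) for t in times]
non_cognate = [math.exp(-0.05 * t) for t in times]

preference = fold_preference(times, cognate, non_cognate)
print(round(preference, 3))
```

    If the pseudo-first-order assumption does not hold, for example when RR is near saturating, the ratio of observed rates no longer reports kcat/KM and a full V0 versus substrate-concentration series would be needed, as the reviewer notes.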

    3. There are a number of places where existing work in the field could be cited more appropriately. The authors argue in a couple of places that ASR has not been used to reconstruct historical protein-protein interactions; however, this is not true. Examples include: Holinski Proteins 2016 https://doi.org/10.1002/prot.25225; Wheeler et al 2018 Biochemistry https://doi.org/10.1021/acs.biochem.7b01086; Lauren et al. MBE 2020. https://doi.org/10.1093/molbev/msaa198; and Wheeler et al MBE 2021. https://doi.org/10.1093/molbev/msab019. Further, on p. 17, the authors cite Field and Matz MBE 2010 as an example of a study looking at the evolution of protein/small-molecule interactions. This is not true: that study looked at the evolution of GFP-like protein color.

    Thank you for this suggestion, and for noting the error. A sentence in the discussion has been changed to clarify that this isn't the first study to apply ancestral sequence reconstruction to protein-protein interactions. The incorrect reference has been removed from the discussion, and references have been added in the introduction for the other noted protein-protein ASR papers.

    4. The authors suggest in a few places that the deepest ancestor (ancHK/ancRR) was not optimized for phosphate transfer because this activity improves for later ancestors. An alternative interpretation is that these deepest ancestors are relatively poorly reconstructed, and thus that overall activity is lower. Indeed the alternate reconstruction of ancHK-alt/ancRR-alt barely showed detectable activity. As such, I think the poor reconstruction hypothesis is much more likely than a suboptimal ancestral function that was subsequently optimized.

    This is a fair criticism and we have added a sentence to acknowledge it explicitly in the discussion.

  2. Evaluation Summary:

    This manuscript is of interest to protein biochemists, protein engineers, and those interested in molecular evolution. The computation and experiments presented in this paper are very logical and rigorously performed. The results provide an example of how protein interaction specificity can be rewired using a small number of mutations, in the context of ancestral sequence reconstruction.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

  3. Reviewer #1 (Public Review):

    It is thought that gene duplications are a major factor driving the emergence and expansion of paralogous proteins. But how redundant proteins evolve to acquire new interaction specificities is poorly understood.

    Here the authors focused on understanding how specificity evolves in signaling pathways involving two-component systems. In bacteria, two-component systems contain a sensor histidine kinase (HK) and a cognate response regulator (RR), usually encoded in the same operon. Upon signal recognition by the kinase, signal transduction is activated, resulting in the phosphorylation of the response regulator. Most bacteria have dozens of these systems; however, cross-talk between histidine kinases and non-cognate response regulators is usually minimal. This specificity was shown to be explained by a small number of amino acid residues at the interface between the histidine kinase and the response regulator, which prevent cross-talk between non-cognate proteins.

    How this specificity evolves to prevent cross-talk between recently duplicated pairs is the question the authors address here. They assume that, for novel signaling pathways to emerge, newly insulated pathways need to be established following a duplication event. They asked how many mutations, in how many of the proteins in each pair, are needed to obtain insulated pathways with no or minimal cross-talk between non-cognate proteins.

    They used a bioinformatics approach (ancestral protein reconstruction) to infer the sequences of the ancestral histidine kinase (ancHK) and ancestral response regulator (ancRR) that existed prior to the duplication event, and to infer the two ancestral paralogous pairs that existed post-duplication. They cloned and expressed these proteins and measured the specificity of all of the reconstructed proteins. They observed that each post-duplication kinase showed slower phosphotransfer to the non-cognate partner than to its cognate partner (and thus a preference for the cognate partner), similar to the extant counterparts in Caulobacter crescentus.

    Then they identified and tested mutations predicted to be involved in specificity. They identified three residues involved in preventing cross-talk between non-cognate partners of the two paralogous pairs, and propose that these three mutations were sufficient to establish specificity, and thus insulation of the two signaling pathways, following the duplication event. Interestingly, these mutations affect only one protein from each of the two-component systems studied, showing that specificity and insulation of pathways can be achieved without mutations in all of the proteins involved. This was surprising, as the initial expectation was that insulating the pathways after a duplication event would require mutations in all four proteins, across the two protein-protein interfaces involved. These findings highlight the importance of mutations that prevent interactions with non-cognate proteins in the insulation and emergence of novel signaling pathways.

    To understand how general these findings are, additional work with other paralogous interacting proteins would be needed.

    In summary, the authors observed that the inferred duplicated paralogs gained specificity towards their cognate partners in comparison with the inferred ancestral proteins. They used a clever approach to identify residues involved in preventing cross-talk between non-cognate paralogous proteins, and they propose that these residues were responsible for driving insulation of the paralogous pathways post-duplication.

    The manuscript is well written and technically sound.

  4. Reviewer #2 (Public Review):

    Nocedal and Laub used ancestral sequence reconstruction to study how protein interaction specificity evolves following a gene duplication event. They considered the EnvZ-OmpR two component signaling system. This histidine kinase-response regulator pair underwent a duplication event in alpha-proteobacteria to yield the paralogous EnvZ1-OmpR1 and EnvZ2-OmpR2 systems in Caulobacter crescentus. The authors used ancestral sequence reconstruction to design maximum likelihood EnvZ-OmpR sequences immediately prior to this gene duplication and just after the gene duplication event. They then characterized these ancestral sequences in vitro, and compared their interaction specificities to those of the extant, native EnvZ-OmpR systems. They found that three mutations introduced soon after gene duplication are sufficient to rewire signaling specificity. Interestingly, the mutations are not isolated to one of the duplicated pairs, but instead are spread across the histidine kinase of one pair and the response regulator of the other. This is consistent with a model in which signaling cross-talk is avoided through sub-functionalization of both protein pairs, rather than neofunctionalization of only one. The experiments appear very well done, the data are presented logically, and the results strongly support their conclusions.

    These experiments nicely add to a growing body of work from the Laub lab that shows how a small number of mutations can lead to new specificity and prevention of cross-talk. Prior work from the same lab has already established that a small number of mutations can rewire specificity in the context of rationally engineered sequences (e.g. Capra 2010, McClune 2019, and Skerker 2008, as cited by the authors); here this is demonstrated using ancestral sequence reconstruction. More generally, we know that protein interfaces are marginally stable, and that interface stability is typically the result of a small number of thermodynamically favorable interactions (classically: Clackson and Wells (1995) Science v.267(383)). Given prior work from the Laub lab, as well as engineering studies from other groups on different model proteins (e.g. Kapp et al (2012) PNAS v.109:5277), it is expected that only a small number of mutations are necessary to rewire specificity. Thus, the results are not particularly surprising, though the use of ancestral sequence reconstruction brings an interesting new dimension to exploring protein interaction specificity. While the data imply a model for evolution of protein-protein interaction specificity through sub-functionalization rather than neofunctionalization for this particular protein pair, it is unclear that this finding should generalize to other protein pairs (or even that a general strategy for evolving specificity post gene duplication should exist at all).

  5. Reviewer #3 (Public Review):

    The authors set out to understand how proteins arising from gene duplication evolve to avoid cross-talk. They used ancestral sequence reconstruction to infer ancestral sequences for two paralogs of bacterial histidine kinases (HK) and their downstream response regulators (RR). They focused on three ancestral pairs of proteins: the pre-duplication ancestor (ancHK/ancRR), the paralog 1 ancestor (ancHK1/ancRR1), and the paralog 2 ancestor (ancHK2/ancRR2). They found that ancHK and ancRR were compatible with each other and their descendants. Post duplication, each HK/RR pair gained specificity: ancHK1 prefers ancRR1 to ancRR2; ancHK2 prefers ancRR2 to ancRR1. They traced these differences in specificity to sequence changes in ancHK2 and ancRR1. ancHK2 acquired two mutations that excluded interaction with ancRR1. ancRR1 acquired a mutation that both disrupted its interaction with ancHK2 and increased its interaction with ancHK1. This reveals two different mechanisms for decreased cross-talk: exclude interaction with a non-cognate partner (mutations that disrupt ancHK2/ancRR1 interaction) or promote interaction with the cognate partner (mutation that improves ancHK1/ancRR1 interaction).

    Strengths:

    The story the authors present is clear, well-written, and intriguing. I find it particularly fascinating that the two lineages disrupted cross-talk by different mechanisms, and that this involved changes to either the HK or the RR, but not both on the same lineage. The phylogenetic and experimental results are mostly clear and convincing. Such work dissecting the evolutionary biochemical mechanisms by which signaling pathways diverge is important for a number of reasons. 1) It can reveal hidden mechanisms of specificity not obvious from simple structural analyses (for example, the mutations this study revealed that allow separation of paralog 1 and paralog 2 signaling.); 2) It helps us understand how prima facie complex evolutionary problems can be readily resolved with only a few mutations; and 3) It shows the "design principles" of these signaling pathways, namely that avoiding cross-talk can likely be achieved by relatively small changes to kinetics of the interactions between non-cognate and cognate proteins.

    Weaknesses:

    1. The authors reconstruct a single phylogenetic tree for both the HK and RR components, concatenating the sequences together and then performing a single analysis. This could be problematic. First, if horizontal gene transfer occurred for one partner but not the other, the gene trees for the HK and RR components could be discordant. In this scenario, the reconstructed sequences would be incorrect because they were inferred on a tree assumed a priori to be concordant. Second, there was insufficient detail in the methods to know how the matched pairs of HK/RR sequences were generated. If the authors inadvertently mixed up paralogs (e.g. generating incorrect HK1-RR2 or HK2-RR1 concatenations) this could lead to a poor phylogenetic inference. A simple way to check for both problems would be to generate phylogenetic trees for HK and RR separately and check for tree concordance. If the separate trees are concordant, the concatenated sequences are justified. If the separate trees are discordant, the authors would have to determine whether independent reconstructions would alter their reconstructed sequences.

    2. The authors use a simple in vitro phosphorylation assay as their assay for the ability of HK to phosphorylate RR. There were, however, two aspects of the assay that were not clear in the text.

    2A: First, the authors built their quantification around tracking the depletion of phosphorylated HK. There were a number of variants that showed much slower HK dephosphorylation than others, with barely detectable RR phosphorylation. A sceptical reviewer might wonder if this slow activity represents specific dephosphorylation or instead spontaneous dephosphorylation to inorganic phosphate. (If the latter, the reconstructed protein is not really functional at all.) An appropriate negative control would be tracking the rate of dephosphorylation of HK with no RR added.

    2B: Second, the authors used this assay to compare relative catalytic efficiencies (kcat/KM) of their variants. It was unclear how they extracted this information from the data as presented, which consist of a single velocity curve determined at a fixed concentration of HK and RR. In most contexts, obtaining kcat/KM requires measuring V0 vs. S0. More information on what precisely is being reported is necessary. (I should note that their qualitative results, looking at the gels, won't be affected by this; just statements like a 28-fold preference of ancHK2 for ancRR2 vs. ancRR1).

    3. There are a number of places where existing work in the field could be cited more appropriately. The authors argue in a couple of places that ASR has not been used to reconstruct historical protein-protein interactions; however, this is not true. Examples include: Holinski Proteins 2016 https://doi.org/10.1002/prot.25225; Wheeler et al 2018 Biochemistry https://doi.org/10.1021/acs.biochem.7b01086; Lauren et al. MBE 2020. https://doi.org/10.1093/molbev/msaa198; and Wheeler et al MBE 2021. https://doi.org/10.1093/molbev/msab019. Further, on p. 17, the authors cite Field and Matz MBE 2010 as an example of a study looking at the evolution of protein/small-molecule interactions. This is not true: that study looked at the evolution of GFP-like protein color.

    4. The authors suggest in a few places that the deepest ancestor (ancHK/ancRR) was not optimized for phosphate transfer because this activity improves for later ancestors. An alternative interpretation is that these deepest ancestors are relatively poorly reconstructed, and thus that overall activity is lower. Indeed the alternate reconstruction of ancHK-alt/ancRR-alt barely showed detectable activity. As such, I think the poor reconstruction hypothesis is much more likely than a suboptimal ancestral function that was subsequently optimized.

  6. Reviewer #4 (Public Review):

    In this paper, Nocedal and Laub address an important question - how signal transduction proteins maintain the specificity of their interactions upon gene duplication, which is the major driving force in signaling innovation. The authors used ancestral sequence reconstruction from a well-defined dataset of paralogous pairs of EnvZ histidine kinase and OmpR response regulator in alphaproteobacteria and tested the specificity of interactions within and between pairs of reconstructed ancestors and extant paralogs. The specificity of interaction was measured as the level of phosphotransfer between a kinase and a response regulator, with the assumption that the faster the process, the more specific the interaction. As a result, key mutations responsible for establishing specificity in paralogous systems were identified and their role confirmed in a series of experiments. The main finding is that only three mutations, which surprisingly occur in the histidine kinase of one system and the response regulator of the other system, were sufficient to establish this specificity. This is a remarkable result, because it provides a deep mechanistic insight into how two-component systems limit potential crosstalk and because, in general, we know very little about the specificity of protein-protein interactions.

    One of the main strengths of this study is a careful choice of the dataset to perform ancestral reconstruction. As acknowledged by the authors, this method is probabilistic and, in addition to known successes (many of which are cited here) produces failures as well (most of which remain unpublished). Thus, it is encouraging to see the successful resurrection of modelled ancestors reported in this study. Results of this reconstruction agree with a generally accepted concept that proteins undergo optimization, in some cases still ongoing, throughout the course of evolution.

    The paper is very clearly written.