Down the Penrose stairs, or how selection for fewer recombination hotspots maintains their existence

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    This study reports an important theoretical model with simulations of meiotic recombination hotspots and Prdm9 evolution. By integrating recently identified biological properties of Prdm9, the model provides compelling evidence for novel features of hotspots and Prdm9 evolution. Yet, the model, the different steps in implementing parameters, and the predictions are difficult to follow and would benefit from clarification.

Abstract

In many species, meiotic recombination events tend to occur in narrow intervals of the genome, known as hotspots. In humans and mice, double strand break (DSB) hotspot locations are determined by the DNA-binding specificity of the zinc finger array of the PRDM9 protein, which is rapidly evolving at residues in contact with DNA. Previous models explained this rapid evolution in terms of the need to restore PRDM9 binding sites lost to gene conversion over time, under the assumption that more PRDM9 binding always leads to more DSBs. This assumption, however, does not align with current evidence. Recent experimental work indicates that PRDM9 binding on both homologs facilitates DSB repair, and that the absence of sufficient symmetric binding disrupts meiosis. We therefore consider an alternative hypothesis: that rapid PRDM9 evolution is driven by the need to restore symmetric binding because of its role in coupling DSB formation and efficient repair. To this end, we model the evolution of PRDM9 from first principles: from its binding dynamics to the population genetic processes that govern the evolution of the zinc finger array and its binding sites. We show that the loss of a small number of strong binding sites leads to the use of a greater number of weaker ones, resulting in a sharp reduction in symmetric binding and favoring new PRDM9 alleles that restore the use of a smaller set of strong binding sites. This decrease, in turn, drives rapid PRDM9 evolutionary turnover. Our results therefore suggest that the advantage of new PRDM9 alleles is in limiting the number of binding sites used effectively, rather than in increasing net PRDM9 binding. By extension, our model suggests that the evolutionary advantage of hotspots may have been to increase the efficiency of DSB repair and/or homolog pairing.

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    This paper addresses the question of how Prdm9-dependent hotspots and Prdm9 alleles evolve. Two properties underlie this question: the erosion of hotspots by biased gene conversion and the high mutation rate of the Prdm9 zinc finger domain. Here the authors include an additional recently observed property of Prdm9: its role in DSB repair, by enhancing DSB repair efficiency when binding on both homologs (symmetric sites). The status of symmetric binding depends on Prdm9 level and affinity, and possibly other factors. The authors present a model for simulating Prdm9 and hotspot co-evolution based on several assumptions (the number of DSBs is independent of Prdm9; there are two types of hotspots, strong or weak; hotspots compete; at least one symmetric DSB is required on the smallest autosome). Although the in vivo context is obviously more complex, these assumptions are reasonable (except for the number of Prdm9 bound sites) as they qualitatively recapitulate or get close to what is known about the requirement for fertility. The model leads to several important conclusions and predictions, notably that Prdm9 limits the number of sites used, since such conditions are predicted to allow for a weaker contribution of asymmetric sites.

    The presentation of the model is clear, but the results are difficult to follow: the text and the associated figures require many readings.

    We edited the results section to make the progression of the argument clearer (as detailed below).

    A few specific points also require clarification:

    Competition: It seems that in the context defined Prdm9 is limiting (since most Prdm9 can be bound to all weak sites); in addition, it is not clear how the competition for DSB activity between Prdm9 sites is taken into account.

    We now clarify throughout the text that we have assumed conditions under which PRDM9 is limiting (as detailed below). We state in the Model that we assume “all PRDM9 bound sites are equally likely to experience a DSB”.
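
    The limiting-PRDM9 assumption and the equal chance of a DSB at every bound site can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes simple equilibrium binding, in which a site with dissociation constant k is bound with probability P_f / (P_f + k), where P_f is the amount of free PRDM9; it treats symmetric binding as two independent binding events; and the weak-site constant (5000) and the site counts are hypothetical values chosen only for illustration. P_T = 1000 and k_1 = 5 are among the values mentioned above; the function names are likewise hypothetical.

```python
def free_prdm9(p_total, site_classes, tol=1e-9):
    """Solve P_T = P_f + sum_i n_i * P_f / (P_f + k_i) for P_f by bisection.

    site_classes is a list of (number_of_sites, dissociation_constant) pairs.
    """
    def excess(p_free):
        bound = sum(n * p_free / (p_free + k) for n, k in site_classes)
        return p_free + bound - p_total

    lo, hi = 0.0, float(p_total)  # excess(lo) < 0 and excess(hi) >= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def symmetric_share(p_total, site_classes):
    """Share of bound sites whose homologous copy is also bound.

    Because every bound site is taken to be equally likely to experience a
    DSB, this is also the chance that a DSB falls at a symmetrically bound
    site (assumption: the two homologs bind independently).
    """
    p_free = free_prdm9(p_total, site_classes)
    bound = 0.0
    bound_on_both = 0.0
    for n, k in site_classes:
        h = p_free / (p_free + k)  # binding probability for this class
        bound += n * h
        bound_on_both += n * h * h
    return bound_on_both / bound


# Hypothetical illustration: the same weak background, many vs. few strong sites.
weak = (100_000, 5000.0)
many_strong = symmetric_share(1000, [(2000, 5.0), weak])
few_strong = symmetric_share(1000, [(200, 5.0), weak])
print(many_strong, few_strong)
```

    Under these assumptions, removing strong sites frees PRDM9 to occupy weak sites, where binding on both homologs is unlikely, so the symmetric share falls even though each remaining strong site is bound more often.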

    The number of Prdm9-bound sites in vivo is not known, thus several values must be tested.

    We have run additional simulations (when considering strong and weak hotspots, k_1 = 5 or 50, and when considering large and small population sizes, N = 10^3 or 10^6), using P_T = 500, 1000 and 2500. The results of these simulations are included and discussed in Appendix 4.

    It would be interesting to discuss the model prediction in the context of several observations published on hybrids with variable Prdm9 gene dosage.

    We now include a section in the Discussion, entitled “PRDM9-mediated hybrid sterility”, which discusses the reported gene dosage effects in mice.

    Reviewer #2 (Public Review):

    In mammalian genomes (with some exceptions), the location of recombination hotspots is driven by the PRDM9 zinc-finger protein, which recognizes specific DNA motifs and recruits the machinery inducing the double-strand breaks (DSBs) that initiate recombination. As DSBs are repaired with the homologous chromosome, "hot motifs" can be rapidly eroded through gene conversion occurring during the repair. This led to the "hotspot paradox" question and to the development of Red Queen models of hotspot evolution, in which the lack of enough DSB motifs can select for new PRDM9 alleles recognizing new sets of motifs, which in turn are eroded. However, this model fails to explain some observations, in particular that the number of DSBs does not seem to be limited by PRDM9 sites. Recent findings also showed that PRDM9 plays a central role in the symmetrical binding of homologous chromosomes.

    In this study, the authors incorporated this new finding (and more realistic assumptions compared to previous models) in a model of hotspot evolution. Their main result is that it affects the evolutionary dynamics and, in particular, the causes of selection on new PRDM9 alleles. Instead of selection pressure to increase the number of DSB targets, they showed that selection likely acts to limit the number of hotspots to the hottest and symmetrical ones. These results are important as they change our view and understanding of the evolution of mammalian hotspots and should have general implications for the study of recombination. The article focuses on complex mechanisms and can appear rather specific and technical. However, it nicely exemplifies the importance of taking molecular mechanisms into account to model genome evolution.

    Overall, the model is sound with no apparent flaw and should be an important contribution to the field. The model is rather complex, but the authors focused on a few key parameters while fixing others based on empirical knowledge. This allows them to highlight the novelty of the results without getting lost in too many scenarios and hypotheses. However, two main issues should be addressed, although they mostly concern the way the model and the results are presented. First, partly due to the complexity of the mechanisms, the core of the manuscript is rather difficult to follow and deserves a more careful and explicit presentation to guide the reader, as detailed below. Second, the implications of the model and the practical and testable predictions it makes could be developed more, in particular to compare with previous models. The main comments are listed below.

    1. The introduction reads very well and clearly explains complex mechanisms. It is a bit long and could be reduced a bit.

    Following this suggestion, we have reduced the length of the Introduction.

    2. It is quite helpful to analyze the model step by step. However, the objective of each step is not clearly explained, and it is left to the reader to understand where the authors want to go. At first read, it is not clear whether the authors present an analysis of the model or simulation results and why they do that. So, the results part deserves rewriting and reorganization to guide the reader.
    • In the first two parts (Fitness with one heat and two heats), it should be stated more explicitly that they correspond to an analysis of the fitness landscapes generated by the molecular mechanisms rather than to results on the evolutionary dynamics.
    • The part "Dynamics of the two-heat model" corresponds to simulations, and it is only at this point that mutation of PRDM9 is introduced.
    • In the present form, the presentation of the results describes many mechanisms (which is fine). However, as the model is complex, stressing the main conclusion of each part could be useful, as could making a clear link between the different steps of the reasoning.

    We have rewritten the results sections to include more signposting and to make clearer the intentions behind each step taken.

    3. The choice of key parameters is well justified with a detailed review of the literature and it is well justified to fix most of them to focus on the key unknown (or not well-known) ones. However, in a few cases, additional simulations or at least better justification would be welcome, in particular on the mutation dynamics of PRDM9.

    Thank you for your suggestion. We have now added an additional appendix (Appendix 5), which investigates the dynamics of our model when newly arising PRDM9 alleles are initiated with hotspot numbers set near values that would be reasonable for perfect matches to motifs with 10 or 11 non-degenerate bases. We show that this sometimes affects the dynamics (compared to the case in the main text), but when it does, the differences can be readily understood using the same kind of reasoning developed in the main text.
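
    For a sense of scale, the expected number of perfect matches to a fully specified motif can be estimated with a back-of-envelope calculation. This is a rough sketch only, not the procedure used in the appendix: it assumes a round genome length of 3 × 10^9 bp, independent and equally frequent bases, and counts a single strand; the function name is hypothetical.

```python
def expected_perfect_matches(genome_length_bp, motif_length):
    """Expected count of exact matches to one fully specified motif.

    Assumes each base is independent and equally frequent, so a given
    position matches the whole motif with probability (1/4) ** motif_length.
    Edge effects and the reverse strand are ignored.
    """
    return genome_length_bp / 4 ** motif_length


GENOME_LENGTH_BP = 3_000_000_000  # assumed round figure, for illustration only

for motif_length in (10, 11):
    count = expected_perfect_matches(GENOME_LENGTH_BP, motif_length)
    print(motif_length, round(count))
```

    Under these assumptions, each additional non-degenerate base reduces the expected number of matches fourfold.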

    4. The model clearly gives new insights into the evolution of recombination hotspots and appears to explain some results better. However, it is not clear which predictions of the model could be properly tested with data, in particular against previous models. Some predictions are proposed but remain mainly qualitative. For example, can one quantify whether this model predicts a more skewed distribution of hotspots than previous Red Queen models? How good is the model at predicting the number of PRDM9 alleles in humans and mice, for example? Only the diversity at PRDM9 is given; it may be interesting to also give the number of alleles to compare with observations. The discussion on this remains a bit vague. Finally, are there additional predictions of the model that could be used to test it?

    In previous Red Queen models, the specific distribution of heats was not important: fitness was determined by the sum of the heats of all available binding sites. Accordingly, these models do not predict a specific distribution, only that PRDM9 alleles that bind more overall would be favored. Our model thus provides the first theoretical framework under which there is an explicit benefit to localizing PRDM9 to smaller numbers of loci, a premise consistent with the use of hotspots, i.e., the use of only a small proportion of the genome for recombination.
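
    The contrast can be made concrete with a toy calculation. It is only an illustration under stated assumptions, not either model's fitness function: "heat" is taken here to be the probability that a single copy of a site is bound, the two homologs are assumed to bind independently, and the site numbers and heats are hypothetical.

```python
import math


def summed_heat(heats):
    """Total binding across sites: the quantity that matters when fitness
    depends only on the sum of the heats."""
    return sum(heats)


def symmetric_proxy(heats):
    """Share of bound sites whose homolog is also bound, assuming the two
    homologs bind independently with the same probability (the heat)."""
    return sum(h * h for h in heats) / sum(heats)


# Two hypothetical landscapes with the same total binding.
concentrated = [0.5] * 100   # few strong sites
diffuse = [0.05] * 1000      # many weak sites

print(math.isclose(summed_heat(concentrated), summed_heat(diffuse)))
print(symmetric_proxy(concentrated), symmetric_proxy(diffuse))
```

    The two landscapes are indistinguishable by summed heat, yet binding concentrated on fewer, hotter sites is far more often symmetric, which is the sense in which localizing PRDM9 to a smaller number of loci can be beneficial.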

    We chose the two-heat model as a reasonable first approximation to the true distribution. If we were to consider a more realistic binding distribution (or similarly, if we relaxed our assumption about most PRDM9 molecules being bound), the quantitative conclusions would likely be affected. Accordingly, while our simplified model provides robust insights into the dynamics of PRDM9 evolution, quantities such as the predicted levels of diversity in our model may be off and cannot be readily compared to what is observed in human and mouse populations. We now better clarify the scope of our results, and what may be done to extend it, in the Discussion.

    5. The Penrose stairs metaphor is appealing, but it seems to depend on the definition of a hotspot, and so may not represent a real biological process. Related to metaphors, it is also not very clear whether the authors suggest abandoning the Red Queen metaphor in favor of the Penrose stairs one. In fact, we can still consider these to be Red Queen dynamics, but with a different underlying driver.

    We have expanded our discussion of the difference between these two analogies in the Discussion section “Does the decay of hotspots by GC lead to more or fewer hotspots?” to clarify that the Penrose stairs model is a specific kind of Red Queen model. However, precisely because a hotspot has a somewhat arbitrary definition, we can imagine the Red Queen running in either direction, towards fewer or more hotspots, depending on our perspective on the Penrose stairs.

  2. eLife assessment

    This study reports an important theoretical model with simulations of meiotic recombination hotspots and Prdm9 evolution. By integrating recently identified biological properties of Prdm9, the model provides compelling evidence for novel features of hotspots and Prdm9 evolution. Yet, the model, the different steps in implementing parameters, and the predictions are difficult to follow and would benefit from clarification.

  3. Reviewer #1 (Public Review):

    This paper addresses the question of how Prdm9-dependent hotspots and Prdm9 alleles evolve. Two properties underlie this question: the erosion of hotspots by biased gene conversion and the high mutation rate of the Prdm9 zinc finger domain. Here the authors include an additional recently observed property of Prdm9: its role in DSB repair, by enhancing DSB repair efficiency when binding on both homologs (symmetric sites). The status of symmetric binding depends on Prdm9 level and affinity, and possibly other factors. The authors present a model for simulating Prdm9 and hotspot co-evolution based on several assumptions (the number of DSBs is independent of Prdm9; there are two types of hotspots, strong or weak; hotspots compete; at least one symmetric DSB is required on the smallest autosome). Although the in vivo context is obviously more complex, these assumptions are reasonable (except for the number of Prdm9 bound sites) as they qualitatively recapitulate or get close to what is known about the requirement for fertility. The model leads to several important conclusions and predictions, notably that Prdm9 limits the number of sites used, since such conditions are predicted to allow for a weaker contribution of asymmetric sites.

    The presentation of the model is clear, but the results are difficult to follow: the text and the associated figures require many readings.

    A few specific points also require clarification:

    Competition: It seems that in the context defined Prdm9 is limiting (since most Prdm9 can be bound to all weak sites); in addition, it is not clear how the competition for DSB activity between Prdm9 sites is taken into account.

    The number of Prdm9-bound sites in vivo is not known, thus several values must be tested.

    It would be interesting to discuss the model prediction in the context of several observations published on hybrids with variable Prdm9 gene dosage.

  4. Reviewer #2 (Public Review):

    In mammalian genomes (with some exceptions), the location of recombination hotspots is driven by the PRDM9 zinc-finger protein, which recognizes specific DNA motifs and recruits the machinery inducing the double-strand breaks (DSBs) that initiate recombination. As DSBs are repaired with the homologous chromosome, "hot motifs" can be rapidly eroded through gene conversion occurring during the repair. This led to the "hotspot paradox" question and to the development of Red Queen models of hotspot evolution, in which the lack of enough DSB motifs can select for new PRDM9 alleles recognizing new sets of motifs, which in turn are eroded. However, this model fails to explain some observations, in particular that the number of DSBs does not seem to be limited by PRDM9 sites. Recent findings also showed that PRDM9 plays a central role in the symmetrical binding of homologous chromosomes.

    In this study, the authors incorporated this new finding (and more realistic assumptions compared to previous models) in a model of hotspot evolution. Their main result is that it affects the evolutionary dynamics and, in particular, the causes of selection on new PRDM9 alleles. Instead of selection pressure to increase the number of DSB targets, they showed that selection likely acts to limit the number of hotspots to the hottest and symmetrical ones. These results are important as they change our view and understanding of the evolution of mammalian hotspots and should have general implications for the study of recombination. The article focuses on complex mechanisms and can appear rather specific and technical. However, it nicely exemplifies the importance of taking molecular mechanisms into account to model genome evolution.

    Overall, the model is sound with no apparent flaw and should be an important contribution to the field. The model is rather complex, but the authors focused on a few key parameters while fixing others based on empirical knowledge. This allows them to highlight the novelty of the results without getting lost in too many scenarios and hypotheses. However, two main issues should be addressed, although they mostly concern the way the model and the results are presented. First, partly due to the complexity of the mechanisms, the core of the manuscript is rather difficult to follow and deserves a more careful and explicit presentation to guide the reader, as detailed below. Second, the implications of the model and the practical and testable predictions it makes could be developed more, in particular to compare with previous models. The main comments are listed below.

    1. The introduction reads very well and clearly explains complex mechanisms. It is a bit long and could be reduced a bit.
    2. It is quite helpful to analyze the model step by step. However, the objective of each step is not clearly explained, and it is left to the reader to understand where the authors want to go. At first read, it is not clear whether the authors present an analysis of the model or simulation results and why they do that. So, the results part deserves rewriting and reorganization to guide the reader.
      - In the first two parts (Fitness with one heat and two heats), it should be stated more explicitly that they correspond to an analysis of the fitness landscapes generated by the molecular mechanisms rather than to results on the evolutionary dynamics.
      - The part "Dynamics of the two-heat model" corresponds to simulations, and it is only at this point that mutation of PRDM9 is introduced.
      - In the present form, the presentation of the results describes many mechanisms (which is fine). However, as the model is complex, stressing the main conclusion of each part could be useful, as could making a clear link between the different steps of the reasoning.
    3. The choice of key parameters is well justified with a detailed review of the literature and it is well justified to fix most of them to focus on the key unknown (or not well-known) ones. However, in a few cases, additional simulations or at least better justification would be welcome, in particular on the mutation dynamics of PRDM9.
    4. The model clearly gives new insights into the evolution of recombination hotspots and appears to explain some results better. However, it is not clear which predictions of the model could be properly tested with data, in particular against previous models. Some predictions are proposed but remain mainly qualitative. For example, can one quantify whether this model predicts a more skewed distribution of hotspots than previous Red Queen models? How good is the model at predicting the number of PRDM9 alleles in humans and mice, for example? Only the diversity at PRDM9 is given; it may be interesting to also give the number of alleles to compare with observations. The discussion on this remains a bit vague. Finally, are there additional predictions of the model that could be used to test it?
    5. The Penrose stairs metaphor is appealing, but it seems to depend on the definition of a hotspot, and so may not represent a real biological process. Related to metaphors, it is also not very clear whether the authors suggest abandoning the Red Queen metaphor in favor of the Penrose stairs one. In fact, we can still consider these to be Red Queen dynamics, but with a different underlying driver.

  5. Reviewer #3 (Public Review):

    In this paper, Baker and colleagues present a model for the evolutionary dynamics of PRDM9, the protein that determines where recombination occurs in many species. PRDM9 is one of the most rapidly evolving proteins, and theoretical models have been developed to understand why it evolves so rapidly. The most popular of these models assumes that PRDM9 (indirectly) causes double-strand breaks where it binds DNA, and this in turn causes the erosion of its binding sites. Over time, this reduces the number of double-strand breaks, ultimately imperiling the proper segregation of chromosomes and hence causing selection for a new PRDM9 allele that can bind new sites. However, recent experimental evidence has shown that PRDM9 merely positions where double-strand breaks occur and that the number of double-strand breaks is controlled independently of PRDM9. This new understanding of the biology of PRDM9 casts doubt on the previous model for why PRDM9 evolves so rapidly, demanding a new explanation.

    This paper takes this updated view of the biology of PRDM9 and formalizes it into a mathematical model of how evolution will act on different PRDM9 alleles and their binding sites. The model is very carefully couched in our current understanding of PRDM9 and is solidly analyzed. Altogether, this paper convincingly reconciles the rapid evolution of PRDM9 and the rapid erosion of its hotspots with the biological finding that PRDM9 itself does not drive double-strand break formation.