Admixture of evolutionary rates across a butterfly hybrid zone

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    The authors leverage theory, simulations, and empirical population genomics to evaluate the consequences of differences in substitution rates between hybridizing species. This is a largely overlooked phenomenon. This study highlights the issue and demonstrates that two hybridizing species of Papilio differ in their substitution rates. The work will be of interest to a large group of evolutionary biologists, especially those studying evolution at the whole-genome level.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

Abstract

Hybridization is a major evolutionary force that can erode genetic differentiation between species, whereas reproductive isolation maintains such differentiation. In studying a hybrid zone between the swallowtail butterflies Papilio syfanius and Papilio maackii (Lepidoptera: Papilionidae), we made the unexpected discovery that genomic substitution rates are unequal between the parental species. This phenomenon creates a novel process in hybridization, where genomic regions most affected by gene flow evolve at similar rates between species, while genomic regions with strong reproductive isolation evolve at species-specific rates. Thus, hybridization mixes evolutionary rates in a way similar to its effect on genetic ancestry. Using coalescent theory, we show that the rate-mixing process provides distinct information about levels of gene flow across different parts of genomes, and the degree of rate-mixing can be predicted quantitatively from relative sequence divergence (FST) between the hybridizing species at equilibrium. Overall, we demonstrate that reproductive isolation maintains not only genomic differentiation, but also the rate at which differentiation accumulates. Thus, asymmetric rates of evolution provide an additional signature of loci involved in reproductive isolation.

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    Xiong and colleagues use an elegant combination of theory development, simulations, and empirical population genomics to interrogate a largely unexplored phenomenon in speciation/hybridization genomics: the consequences and implications of admixture between species with differing substitution rates. The work presented in this well-written manuscript is thorough, thought-provoking, and represents an important advancement for the field. However, there are a few instances where I feel the strength of the conclusions drawn is not fully supported.

    Thank you for the positive comments!

    The authors begin by presenting evidence based on whole genome sequencing that the two focal species, P. syfanius and P. maackii, are highly diverged despite ongoing hybridization, though the discussion of the remarkable mitochondrial sequence similarity is underdeveloped. I do not understand how such a pattern is not most likely the result of introgression from one species to the other, given the relatively high FST across much of the nuclear genome coupled with the generally higher mitochondrial mutation rate in animals.

    That’s a very good point. We have included this likely explanation of mitochondrial genome similarity in Lines 84-86.

    Next, they posit that barrier loci are likely to exist. To support this assertion, the authors use a combination of parental population genetic diversity and divergence comparisons and ancestry pattern analysis in hybrid populations. They show that there is a strong correlation between divergence across pure species and within-species diversity across the autosomes. Then, using four hybrid individuals, they show that low ancestry randomness, as quantified by estimates of between-group and within-group entropy, is associated with genomic regions of reduced within-group diversity and elevated between-group divergence. The use of entropy estimates as a stand-in for admixture proportions and ancestry block analysis when sample size is severely limited is particularly clever. Although I must admit I do not fully understand the derivations of the two entropy measures, it seems to me that relatedness might have a strong effect on the interpretability of between-individual entropy estimates (Sb). With very small population sizes this may be a real issue.

    Yes, genetic relatedness will play a big role in between-individual entropy (Sb). A group of highly correlated individuals will produce highly predictable ancestry (knowing one individual’s local ancestry gives much information on the local ancestries of others), and Sb will be small because entropy is a measure of uncertainty. If inbreeding is very severe, Sb will no longer be a useful measure because it will be too small across the entire genome. In our hybrid samples, although some genomic regions imply the possibility of inbreeding (see local ancestry of Z chromosomes in Figure 3–Figure supplement 1), there is still considerable variation of Sb across the genome which allows us to test for its correlation with DXY and π.
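
    The intuition in this reply (knowing one individual's local ancestry tells you much about a related individual's) can be illustrated with a generic conditional-entropy calculation. The sketch below is only an illustration under stated assumptions: it is not the authors' definition of Sb, the function names are hypothetical, and the ancestry calls are invented toy values (copies of P. maackii ancestry per genomic window).

```python
import math
from collections import Counter


def shannon_entropy(states):
    """Shannon entropy (in bits) of the empirical distribution of `states`."""
    counts = Counter(states)
    total = len(states)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(target, given):
    """H(target | given) = H(target, given) - H(given), in bits."""
    if len(target) != len(given):
        raise ValueError("ancestry vectors must cover the same windows")
    joint = list(zip(target, given))
    return shannon_entropy(joint) - shannon_entropy(given)


# Toy local-ancestry calls (0, 1, or 2 copies) in eight windows.
# These values are invented for illustration, not taken from the study.
focal = [0, 0, 1, 1, 2, 2, 1, 0]
close_relative = list(focal)              # ancestry fully predictable from `focal`
unrelated = [1, 0, 2, 0, 1, 2, 0, 1]      # ancestry only weakly predictable

h_relative = conditional_entropy(close_relative, focal)
h_unrelated = conditional_entropy(unrelated, focal)
print(f"H(relative | focal)  = {h_relative:.3f} bits")
print(f"H(unrelated | focal) = {h_unrelated:.3f} bits")
```

    In this toy case the remaining uncertainty about the close relative is zero, while the unrelated individual retains substantial uncertainty, which mirrors why strong relatedness or inbreeding would push a between-individual entropy toward small values genome-wide.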

    A brief discussion of potential caveats in using the new method developed here seems warranted given its potential usefulness to the population genomics field more broadly. One plausible but less likely alternative interpretation of these patterns is briefly discussed.

    We have now devoted the first subsection of the Discussion to the caveats and the various motivations for the entropy metrics. The appendix also contains further explanation of our intuition (section “Appendix-The entropy of ancestry”).

    The authors then move on to evidence for divergent substitution rates. Analysis of both D3 and D4 statistics using several different outgroups and a series of progressively stringent FST thresholds shows that site patterns between the two species are highly asymmetrical, with the P. maackii lineage harboring more substitutions than P. syfanius. The authors offer two possible explanations for this finding and then test both hypotheses. First, they use a comparative tree-based method to show that there is little phylogenetic evidence for lineage-biased hybridization from outgroups into either of the focal lineages. Further, the range overlaps of the study species do not correspond with the inferred direction of allele sharing from the Dstat analysis. This is a good argument against contemporary gene flow between the outgroups and P. syfanius, but I am not convinced that ancient gene flow, which could have occurred when species distributions were different, can be ruled out using this analysis.

    Yes, we also felt that our original wording was overly strong. Now we say that our argument is based on current geographic distributions, but that archaic gene flow cannot be totally ruled out. However, we also point out that archaic gene flow with outgroups should still leave some detectable fraction of paraphyletic local gene trees after phylogenetic reconstruction (Lines 192-194).

    To test whether this asymmetry can be explained by a difference in substitution rate between the two species, the authors show that observed D3 increases and D4 decreases with increasingly divergent outgroups, as predicted by theory developed here. The authors take this as evidence supporting divergent substitution rates, though they claim only that the existence of such rate divergence is likely. The unfortunately limited sample sizes seem to preclude attaining more certainty than this. Interestingly, as a byproduct of using D4 as an extended measure of site pattern asymmetry, the authors highlight one way in which the ABBA-BABA test can give false positives for introgression. This is an important contribution to the field.

    We agree with the reviewer that, for our data type (a handful of unphased genomes), it will be difficult to obtain more direct evidence for substitution rate differences. In Lines 182-187, we show using maximum-likelihood gene tree reconstruction that P. maackii samples often inherit more derived mutations than P. syfanius. This could be viewed as a separate test utilizing more accurate substitution models in phylogenetic software, while our theoretical calculation provides a coarse but testable signature of D3 and D4.

    To provide more direct evidence, we believe one ought to measure spontaneous mutation rates in both species in their native habitats, and obtain better knowledge of generation times and population sizes. The difficulties of sampling and rearing these rare species are major barriers to incorporating this kind of evidence into this study.
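
    For readers unfamiliar with site pattern asymmetry, the classic four-taxon ABBA-BABA statistic can be sketched as below. This is the standard textbook form, D = (nABBA - nBABA) / (nABBA + nBABA), and is an assumption for illustration only: the manuscript's exact definitions of D3 and D4 are not reproduced on this page and may differ. The function names and the toy sequences are hypothetical.

```python
def count_site_patterns(p1, p2, p3, outgroup):
    """Count ABBA and BABA sites in four aligned haploid sequences.

    The outgroup allele is treated as ancestral ('A'); any other allele
    is treated as derived ('B'). Only biallelic sites are considered.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if len({a, b, c, o}) != 2:
            continue  # skip invariant or multi-allelic sites
        pattern = "".join("A" if x == o else "B" for x in (a, b, c))
        if pattern == "ABB":
            abba += 1
        elif pattern == "BAB":
            baba += 1
    return abba, baba


def patterson_d(abba, baba):
    """Normalized asymmetry between ABBA and BABA counts."""
    total = abba + baba
    if total == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / total


# Invented toy alignment, not data from the study.
outgroup = "AAAAAA"
p1 = "AAATAA"
p2 = "TTTAAT"
p3 = "TTTTAA"

abba, baba = count_site_patterns(p1, p2, p3, outgroup)
print(abba, baba, patterson_d(abba, baba))
```

    A nonzero value only reports that the two patterns are unbalanced. As the reviewer notes, unequal substitution rates between lineages are one way such an imbalance can arise without introgression, so the statistic alone does not identify the cause.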

    Finally, the authors observe a monotonic relationship between substitution rate ratio and relative genetic divergence across the genome, which is in line with their theoretical predictions for differential substitution rates in the face of gene flow. From this they infer an 80% increase in substitution rate from P. syfanius to P. maackii. It is remarkable to be able to extract these substitution rates from genomic regions with the least gene flow. However, the veracity of these estimates relies on the assumptions I have highlighted above and should be presented with appropriate caution.

    We have included the limitations of our conclusions in the final subsection of the Discussion. Because high FST regions are relatively rare, estimates of observed rate ratio “r” have larger errors in those regions. This problem is partially resolved by using the entire monotonic relationship between r and FST to estimate the true rate ratio, so we rely not only on regions with the least gene flow but the full dataset.

    However, we do agree with the reviewer that ours is still a coarse theoretical framework, since we do not impose a realistic substitution model (e.g., we don’t allow reverse mutations). We have now emphasized this weakness in the Discussion (Lines 348-350).

    Reviewer #2 (Public Review):

    In their manuscript ("Admixture of evolutionary rates across a hybrid zone"), Xiong et al. use whole genome resequencing data to assess rates of genome evolution between two species of butterflies and determine whether putative barrier loci between the species are also those that evolve at asymmetric rates between them. This work presents a novel hypothesis and rigorously tests these ideas using a combination of empirical and theoretical work. I think the authors could more formally link loci that are evolving at highly asymmetric rates with those that are most likely to be barrier loci by evaluating the relationship between ancestry entropy and ratios of substitution rates between species. Additionally, clarifying the relationship between barrier loci and asymmetric evolution would be beneficial (i.e. are loci that we typically envision to be barrier loci, such as loci involved in reproductive isolation, evolving at asymmetric rates or do asymmetrically evolving loci represent a new type of barrier loci?).

    Many thanks for these comments! For the second point (clarifying the relationship between barrier loci and asymmetric evolution), we mean that barrier loci (which are of particular interest to those who study speciation) cause asymmetric rates of evolution to be preserved between hybridizing species. Asymmetric rates themselves are caused by other factors (spontaneous mutation rate differences, generation times, environmental effects) specific to each species, and barrier loci merely prevent the mixing of asymmetric rates. For the first point (evaluating the relationship between entropy and ratios of substitution rates).

  2. Evaluation Summary:

    The authors leverage theory, simulations, and empirical population genomics to evaluate the consequences of differences in substitution rates between hybridizing species. This is a largely overlooked phenomenon. This study highlights the issue and demonstrates that two hybridizing species of Papilio differ in their substitution rates. The work will be of interest to a large group of evolutionary biologists, especially those studying evolution at the whole-genome level.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

  3. Reviewer #1 (Public Review):

    Xiong and colleagues use an elegant combination of theory development, simulations, and empirical population genomics to interrogate a largely unexplored phenomenon in speciation/hybridization genomics: the consequences and implications of admixture between species with differing substitution rates. The work presented in this well-written manuscript is thorough, thought-provoking, and represents an important advancement for the field. However, there are a few instances where I feel the strength of the conclusions drawn is not fully supported.

    The authors begin by presenting evidence based on whole genome sequencing that the two focal species, P. syfanius and P. maackii, are highly diverged despite ongoing hybridization, though the discussion of the remarkable mitochondrial sequence similarity is underdeveloped. I do not understand how such a pattern is not most likely the result of introgression from one species to the other, given the relatively high FST across much of the nuclear genome coupled with the generally higher mitochondrial mutation rate in animals.

    Next, they posit that barrier loci are likely to exist. To support this assertion, the authors use a combination of parental population genetic diversity and divergence comparisons and ancestry pattern analysis in hybrid populations. They show that there is a strong correlation between divergence across pure species and within-species diversity across the autosomes. Then, using four hybrid individuals, they show that low ancestry randomness, as quantified by estimates of between-group and within-group entropy, is associated with genomic regions of reduced within-group diversity and elevated between-group divergence. The use of entropy estimates as a stand-in for admixture proportions and ancestry block analysis when sample size is severely limited is particularly clever. Although I must admit I do not fully understand the derivations of the two entropy measures, it seems to me that relatedness might have a strong effect on the interpretability of between-individual entropy estimates (Sb). With very small population sizes this may be a real issue. A brief discussion of potential caveats in using the new method developed here seems warranted given its potential usefulness to the population genomics field more broadly. One plausible but less likely alternative interpretation of these patterns is briefly discussed.

    The authors then move on to evidence for divergent substitution rates. Analysis of both D3 and D4 statistics using several different outgroups and a series of progressively stringent FST thresholds shows that site patterns between the two species are highly asymmetrical, with the P. maackii lineage harboring more substitutions than P. syfanius. The authors offer two possible explanations for this finding and then test both hypotheses.

    First, they use a comparative tree-based method to show that there is little phylogenetic evidence for lineage-biased hybridization from outgroups into either of the focal lineages. Further, the range overlaps of the study species do not correspond with the inferred direction of allele sharing from the Dstat analysis. This is a good argument against contemporary gene flow between the outgroups and P. syfanius, but I am not convinced that ancient gene flow, which could have occurred when species distributions were different, can be ruled out using this analysis.

    To test whether this asymmetry can be explained by a difference in substitution rate between the two species, the authors show that observed D3 increases and D4 decreases with increasingly divergent outgroups, as predicted by theory developed here. The authors take this as evidence supporting divergent substitution rates, though they claim only that the existence of such rate divergence is likely. The unfortunately limited sample sizes seem to preclude attaining more certainty than this. Interestingly, as a byproduct of using D4 as an extended measure of site pattern asymmetry, the authors highlight one way in which the ABBA-BABA test can give false positives for introgression. This is an important contribution to the field.

    Finally, the authors observe a monotonic relationship between substitution rate ratio and relative genetic divergence across the genome, which is in line with their theoretical predictions for differential substitution rates in the face of gene flow. From this they infer an 80% increase in substitution rate from P. syfanius to P. maackii. It is remarkable to be able to extract these substitution rates from genomic regions with the least gene flow. However, the veracity of these estimates relies on the assumptions I have highlighted above and should be presented with appropriate caution.

  4. Reviewer #2 (Public Review):

    In their manuscript ("Admixture of evolutionary rates across a hybrid zone"), Xiong et al. use whole genome resequencing data to assess rates of genome evolution between two species of butterflies and determine whether putative barrier loci between the species are also those that evolve at asymmetric rates between them. This work presents a novel hypothesis and rigorously tests these ideas using a combination of empirical and theoretical work. I think the authors could more formally link loci that are evolving at highly asymmetric rates with those that are most likely to be barrier loci by evaluating the relationship between ancestry entropy and ratios of substitution rates between species. Additionally, clarifying the relationship between barrier loci and asymmetric evolution would be beneficial (i.e. are loci that we typically envision to be barrier loci, such as loci involved in reproductive isolation, evolving at asymmetric rates or do asymmetrically evolving loci represent a new type of barrier loci?).