Recent evolutionary origin and localized diversity hotspots of mammalian coronaviruses

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    Maestri et al. report the absence of phylogenetic evidence supporting codiversification of mammalian coronaviruses and their hosts, leading to the important conclusion that the evolutionary histories of the virus and its hosts are decoupled through frequent host switches. The evidence for frequent host switching, derived from a probabilistic model of co-evolution, appears convincing, but evidence for quantitative statements about the time of the last common ancestor of extant mammalian coronaviruses remains incomplete. The results would be strengthened by a reconstruction of the evolutionary timescale and further investigation of robustness to sampling biases and unsampled diversity.

Abstract

Several coronaviruses infect humans, with three, including SARS-CoV-2, causing diseases. While coronaviruses are especially prone to cause pandemics, we know little about their evolutionary history, host-to-host transmissions, and biogeography. One of the difficulties lies in dating the origination of the family, a particularly challenging task for RNA viruses in general. Previous cophylogenetic tests of virus-host associations, including in the Coronaviridae family, have suggested a virus-host codiversification history stretching many millions of years. Here, we establish a framework for robustly testing scenarios of ancient origination and codiversification versus recent origination and diversification by host switches. Applied to coronaviruses and their mammalian hosts, our results support a scenario of recent origination of coronaviruses in bats and diversification by host switches, with preferential host switches within mammalian orders. Hotspots of coronavirus diversity, concentrated in East Asia and Europe, are consistent with this scenario of relatively recent origination and localized host switches. Spillovers from bats to other species are rare, but are more likely to be towards humans than towards any other mammal species, implicating humans as the evolutionary intermediate host. The high host-switching rates within orders, as well as between humans, domesticated mammals, and non-flying wild mammals, indicate the potential for rapid additional spreading of coronaviruses across the world. Our results suggest that the evolutionary history of extant mammalian coronaviruses is recent, and that cases of long-term virus–host codiversification have been largely overestimated.

Article activity feed

  1. Author Response

    Public Reviews:

    Reviewer #1 (Public Review):

    Summary:

    In this study, Maestri et al. use an integrative framework to study the evolutionary history of coronaviruses. They find that coronaviruses arose recently rather than having undergone ancient codivergences with their mammalian hosts. Furthermore, recent host switching has occurred extensively, but typically between closely related species. Humans have acted as an intermediate host, especially between bats and other mammal species.

    Strengths:

    The study draws on a range of data sources to reconstruct the history of virus-host codivergence and host switching. The analyses include various tests of robustness and evaluations through simulation.

    Weaknesses:

    The analyses are limited to a single genetic marker (RdRp) from coronaviruses, but using other sections of the genome might lead to different conclusions. The genetic marker also lacks resolution for recent divergences, which precludes the detailed examination of recent host switches. Careful and detailed reconstruction of the timescale would be helpful for clarifying the evolutionary history of coronaviruses alongside their hosts.

    The use of a single short genetic marker (the RdRp palmprint region) from coronaviruses is indeed a limitation. However, this marker is the one that is currently used for routinely delimiting operational taxonomic units in RNA viruses and reconstructing their evolutionary history (Edgar et al. 2022, see also the Serratus project; https://serratus.io/); therefore, we made the conscious decision early on to rely on this expertise. Unfortunately, this marker cannot provide robust timescale reconstructions for coronavirus evolution (previous estimates of coronavirus origin range from around 10 thousand years ago to 293 million years ago depending on modeling assumptions). Only future genomic work across Coronaviridae, characterizing multiple genetic regions with different evolutionary rates, will allow us to precisely elucidate the timescale of the evolutionary history of coronaviruses alongside their hosts. In the meantime, we show here that, while the RdRp palmprint region cannot by itself resolve the precise timescale of coronavirus evolution, it strongly suggests, when used along with cophylogenetic approaches, a recent evolutionary origin in bats.

    R. C. Edgar, et al., Petabase-scale sequence alignment catalyses viral discovery. Nature 602, 142–147 (2022).

    Reviewer #2 (Public Review):

    Summary:

    In their study titled "Recent evolutionary origin and localized diversity hotspots of mammalian coronaviruses," authors Benoît Perez-Lamarque, Renan Maestri, Anna Zhukova, and Hélène Morlon investigate the complex evolutionary history of coronaviruses, particularly those affecting mammals, including humans. The study focuses on unraveling the evolutionary trajectory of these viruses, which have shown a high propensity for causing pandemics, as evidenced by the SARS-CoV-2 outbreak.

    The research addresses a significant gap in our understanding of the evolutionary dynamics of coronaviruses, particularly their history, patterns of host-to-host transmission, and geographical spread. These aspects are important for predicting and managing future pandemic scenarios.

    Historically, studies have employed cophylogenetic tests to explore virus-host relationships within the Coronaviridae family, often suggesting a long history of virus-host codiversification spanning millions of years. However, the team led by Perez-Lamarque proposes a novel phylogenetic framework that contrasts this traditional view. Their approach, which involves adapting gene tree-species tree reconciliation, is designed to robustly test the validity of two competing scenarios: an ancient origination and codiversification versus a more recent emergence and diversification through host switching.

    Upon applying this innovative framework to the study of coronaviruses and their mammalian hosts, the authors' findings challenge the prevailing notion of a deep evolutionary history. Instead, their results strongly support a scenario where coronaviruses have a more recent origin, likely in bat populations, followed by diversification predominantly through host-switching events. This diversification, interestingly, seems to occur preferentially within mammalian orders.

    A critical aspect of their findings is the identification of hotspots of coronavirus diversity, particularly in East Asia and Europe. These regions align with the proposed scenario of a relatively recent origin and subsequent localized host-switching events. The study also highlights the rarity of spillovers from bats to other species, yet underscores the relatively higher likelihood of such spillovers occurring towards humans, suggesting a significant role for humans as an intermediate host in the evolutionary journey of these viruses.

    The research also points out the high rates of host-switching within mammalian orders, including between humans, domesticated animals, and non-flying wild mammals.

    In conclusion, the study by Perez-Lamarque and colleagues presents an important quantitative advance in our understanding of the evolutionary history of mammalian coronaviruses. It suggests that the long-held belief in extensive virus-host codiversification may have been substantially overestimated, paving the way for a reevaluation of how we understand, predict, and potentially control the spread of these viruses.

    Strengths:

    The study is conceptually robust, and its conclusions are convincing.

    Weaknesses:

    Despite the availability of a dated host tree, the authors were only able to use the "undated" model in ALE, with the dated method (which only allows time-consistent transfers) failing on their dataset (possibly due to dataset size?). Further exploration of this question would be potentially valuable.

    Our intuition is that the “dated” version of ALE did not necessarily fail on our dataset because of its size (ALE ran, but provided unrealistic parameter estimates and was not able to output possible reconciliations, as mentioned in our Materials and Methods section). We think it most likely failed because there is no pattern of codiversification: the coronavirus and mammal trees are so distinct that finding a reconciliation scenario between these trees with time-consistent transfers is very difficult, and ALE fails to estimate an amalgamated likelihood for such an unlikely scenario. Following a suggestion from reviewer #3, we are going to try running the dated version of ALE independently on the alpha- and betacoronaviruses, resulting in smaller datasets. This will help us elucidate whether the dated version of ALE fails due to data size or the absence of a codiversification pattern.
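
    To make the notion of a "time-consistent" transfer concrete, here is a minimal Python sketch. It is an illustration only and not how ALE is implemented: the `Branch` class, the `is_time_consistent` function, and the branch ages are all hypothetical. It assumes that a host switch between two branches of a dated host tree is only possible when the two branches coexist at some point in time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """A branch of a dated host tree (hypothetical structure).

    Ages are in million years before present, so start_age >= end_age.
    """
    name: str
    start_age: float  # age of the parent node (older)
    end_age: float    # age of the child node or tip (younger)


def is_time_consistent(donor: Branch, recipient: Branch) -> bool:
    """Return True if the two branches coexist during some time interval."""
    overlap_start = min(donor.start_age, recipient.start_age)
    overlap_end = max(donor.end_age, recipient.end_age)
    return overlap_start > overlap_end


# Hypothetical branches, for illustration only.
bat = Branch("bat lineage", start_age=50.0, end_age=0.0)
rodent = Branch("rodent lineage", start_age=30.0, end_age=0.0)
ancient_stem = Branch("ancient stem lineage", start_age=90.0, end_age=60.0)

print(is_time_consistent(bat, rodent))        # contemporaneous branches
print(is_time_consistent(ancient_stem, bat))  # branches that never coexist
```

    Under this assumption, an undated model would accept both transfers above, whereas a dated model would reject the second one, which is why a dated reconciliation can be much harder to find when the two trees are very different.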

    Reviewer #3 (Public Review):

    Summary:

    This work uses tools and concepts from co-phylogenetic analyses to reconstruct the evolutionary and diversification history of coronaviruses in mammals. It concludes that cross-species transmissions from bats to humans are a relatively common event (compared to bats to other species). Across all mammals, the diversification history of coronaviruses suggests that there is potential for further evolutionary diversification.

    Strengths:

    The article uses an interesting approach based on jointly looking at the extant network of coronaviruses-mammals interactions, and the phylogenetic history of both these organisms. The authors do an impressive job of explaining the challenges of reconstructing evolutionary dynamics for RNA viruses, and this helps readers appraise the relevance of their approach.

    Weaknesses:

    I remain unconvinced by the argument that sampling does not introduce substantial biases in the analyses. As the authors highlight, incomplete knowledge of the extant interactions would lead to a biased reconstruction of the diversification history. In a recent paper (Poisot et al. 2023, Patterns), we look at sampling biases in the virome of mammals and suggest that it is a fairly prominent issue that is, furthermore, structured by taxonomy, space, and phylogenetic position. Case in point, even for betacoronaviruses, there have been many newly confirmed hosts in recent years. For organisms that have received less intense scrutiny, I think a thorough discussion of potential gaps in data would be required (see for example Cohen et al. 2022, Nat. Comms).

    I was also surprised to see little discussion of the differences between alpha and beta coronaviruses - there is evidence that they may differ in their cross-species transmission (see Caraballo et al. 2022 Micr. Spectr.), which could call into question the relevance of treating all coronaviruses as a single, homogeneous group.

    Some of the discussions in this paper also echo previous work by e.g. Geoghegan et al. (see 2017, PLOS Pathogens), which I was surprised to not see discussed, as it is a much earlier investigation of the relative frequencies of co-divergence and host switches for different viral families, with a deep discussion of how this may structure future evolutionary dynamics.

    We fully agree that sampling biases in the virome of mammals are a prominent issue, which is why we conducted a series of sensitivity analyses to test their effect on our main conclusions. We thoroughly tested the effect of (i) the unequal sampling effort across mammalian species that have been screened and (ii) the unequal screening of mammalian species across the mammalian tree of life by subsampling the data to correct for the unequal sampling effort (see Supporting Information Text). In both cases, we still found low support for a scenario of codiversification, and we still recovered the origin in bats in East Asia, the preferential host switches within mammalian orders, and the rare spillovers from bats to humans. The robustness of our findings to sampling biases may be explained by the fact that the cophylogenetic approach we used (ALE) explicitly accounts for undersampling by assuming that all host transfers involve unsampled intermediate hosts. To address the reviewer's comment, we will better underline the importance of sampling biases in our main text and include the suggested references. We will also better highlight our sensitivity analyses by moving them from the Supporting Information Text to the main text.
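
    The general idea of subsampling to equalize sampling effort can be sketched as follows. This is a minimal illustration and not the procedure of the Supporting Information Text: the function name `subsample_per_host`, the cap of two records per host, and the toy host and virus labels are all hypothetical assumptions.

```python
import random
from collections import defaultdict


def subsample_per_host(records, max_per_host, seed=0):
    """Keep at most `max_per_host` randomly chosen records per host species.

    `records` is a list of (host, virus) pairs, one pair per screening
    record. Hosts with fewer records than the cap keep all of them.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    by_host = defaultdict(list)
    for host, virus in records:
        by_host[host].append(virus)

    subsampled = []
    for host in sorted(by_host):
        viruses = by_host[host]
        k = min(max_per_host, len(viruses))
        for virus in rng.sample(viruses, k):
            subsampled.append((host, virus))
    return subsampled


# Toy data, for illustration only: one heavily screened host, two poorly screened.
records = (
    [("host_A", f"virus_{i}") for i in range(10)]
    + [("host_B", "virus_10"), ("host_B", "virus_11")]
    + [("host_C", "virus_12")]
)

balanced = subsample_per_host(records, max_per_host=2)
print(len(balanced))  # number of records kept after capping each host
```

    In a sensitivity analysis of this kind, the downstream analysis would be repeated on several such subsamples (for example with different seeds) to check whether the main conclusions still hold.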

    We agree that distinguishing between alpha and beta coronaviruses will provide useful additional insights; we are going to run separate cophylogenetic analyses for these two sub-clades. We will report the results of these additional analyses in the revised manuscript, and put them in context with the existing literature about the two sub-clades.

    We were not aware of the work of Geoghegan et al. (see 2017, PLOS Pathogens); thank you for providing this reference, which we will now discuss.
