Exploring the Natural Origins of SARS-CoV-2 in the Light of Recombination

Abstract

The lack of an identifiable intermediate host species for the proximal animal ancestor of SARS-CoV-2, and the large geographical distance between Wuhan and where the closest evolutionarily related coronaviruses circulating in horseshoe bats (members of the Sarbecovirus subgenus) have been identified, are fueling speculation on the natural origins of SARS-CoV-2. We performed a comprehensive phylogenetic study on SARS-CoV-2 and all the related bat and pangolin sarbecoviruses sampled so far. Determining the likely recombination events reveals a highly reticulate evolutionary history within this group of coronaviruses. Distribution of the inferred recombination events is nonrandom, with evidence that Spike, the main target for humoral immunity, is beside a recombination hotspot likely driving antigenic shift events in the ancestry of bat sarbecoviruses. Coupled with the geographic ranges of their hosts and the sampling locations, across southern China and into Southeast Asia, we confirm that horseshoe bats, Rhinolophus, are the likely reservoir species for the SARS-CoV-2 progenitor. By tracing the recombinant sequence patterns, we conclude that there has been relatively recent geographic movement and cocirculation of these viruses’ ancestors, extending across their bat host ranges in China and Southeast Asia over the last 100 years. We confirm that a direct proximal ancestor to SARS-CoV-2 has not yet been sampled, since the closest known relatives collected in Yunnan shared a common ancestor with SARS-CoV-2 approximately 40 years ago. Our analysis highlights the need for dramatically more wildlife sampling to: 1) pinpoint the exact origins of SARS-CoV-2’s animal progenitor, 2) identify the intermediate species that facilitated transmission from bats to humans (if there is one), and 3) survey the extent of the diversity in the related sarbecoviruses’ phylogeny that present high risk for future spillovers.

Article activity feed

  1. SciScore for 10.1101/2021.01.22.427830

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    The intergenic regions were also aligned separately using MAFFT and all alignments were pieced together into the final whole-genome alignment and visually inspected in Bioedit (Hall and Others 1999).
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    Bioedit
    suggested: (BioEdit, RRID:SCR_007361)
    Phylogenies were visualised using FigTree (http://tree.bio.ed.ac.uk/software/figtree/) and ETE 3 (Huerta-Cepas et al. 2016).
    FigTree
    suggested: (FigTree, RRID:SCR_008515)
    To provide temporal information to the phylogenetic history of the viruses, we performed a Bayesian phylogenetic analysis on non-recombination region 4, using BEAST (Bouckaert et al. 2019).
    BEAST
    suggested: (BEAST, RRID:SCR_010228)
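
The first sentence quoted in the table describes aligning regions separately and then piecing the alignments together into one whole-genome alignment. A minimal sketch of that concatenation step is shown here, assuming each regional alignment is held as a mapping from sequence name to gapped string. The function name `concatenate_alignments` and the toy sequences are hypothetical illustrations, not the authors' pipeline or data.

```python
def concatenate_alignments(blocks):
    """Join per-region alignments (name -> gapped sequence) in the given order."""
    if not blocks:
        raise ValueError("no alignment blocks supplied")
    names = list(blocks[0])
    for block in blocks:
        # Every block must contain exactly the same set of sequences.
        if set(block) != set(names):
            raise ValueError("blocks contain different sequence names")
        # Every block must be rectangular: all rows share one length.
        if len({len(seq) for seq in block.values()}) != 1:
            raise ValueError("rows within a block differ in length")
    # Concatenate the rows of each sequence across blocks, left to right.
    return {name: "".join(block[name] for block in blocks) for name in names}


# Toy data (hypothetical): two coding blocks with an intergenic block between.
gene_a = {"virus1": "ATGAAA", "virus2": "ATG---"}
intergenic = {"virus1": "TT-A", "virus2": "TTCA"}
gene_b = {"virus1": "ATGCCC", "virus2": "ATGCCA"}

genome = concatenate_alignments([gene_a, intergenic, gene_b])
print(genome["virus1"])
print(genome["virus2"])
```

The checks before joining mirror the reason a visual inspection is useful: a block with a missing sequence or ragged rows would silently shift every downstream column.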

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
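
To illustrate what checking for the presence of RRIDs can look like, here is a small sketch that pulls identifiers out of report text. It is not SciScore's method. The regular expression is an assumption that covers only the `SCR_`-prefixed identifiers appearing in the table above, and the name `find_rrids` is hypothetical.

```python
import re

# Assumed pattern: only SCR_-style identifiers such as RRID:SCR_011811.
# Other RRID families are deliberately not handled in this sketch.
RRID_PATTERN = re.compile(r"RRID:\s?(SCR_\d+)")


def find_rrids(text):
    """Return every SCR_-style RRID found in the text, in order of appearance."""
    return RRID_PATTERN.findall(text)


report = (
    "suggested: (MAFFT, RRID:SCR_011811) "
    "suggested: (BioEdit, RRID:SCR_007361)"
)
print(find_rrids(report))
```

A presence check like this says nothing about correctness; confirming that an identifier maps to the named tool would require a lookup against the registry itself.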