A novel SARS-CoV-2 related coronavirus in bats from Cambodia


Knowledge of the origin and reservoir of the coronavirus responsible for the ongoing COVID-19 pandemic is still fragmentary. To date, the closest relatives to SARS-CoV-2 have been detected in Rhinolophus bats sampled in Yunnan province, China. Here we describe the identification of SARS-CoV-2 related coronaviruses in two Rhinolophus shameli bats sampled in Cambodia in 2010. Metagenomic sequencing identified nearly identical viruses sharing 92.6% nucleotide identity with SARS-CoV-2. Most genomic regions are closely related to SARS-CoV-2, with the exception of a small region corresponding to the spike N-terminal domain. The discovery of these viruses in a bat species not found in China indicates that SARS-CoV-2 related viruses have a much wider geographic distribution than previously understood, and suggests that Southeast Asia represents a key area to consider in the ongoing search for the origins of SARS-CoV-2, and in future surveillance for coronaviruses.

Article activity feed

  1. SciScore for 10.1101/2021.01.26.428212:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    The samples were tested with a pan-coronavirus (pan-CoV) hemi-nested RT-PCR [8] and by a RT-qPCR known to detect sarbecoviruses [9], including SARS-CoV-2.
    suggested: (Active Motif Cat# 91351, RRID:AB_2847848)
    Genome assembly: Raw reads were trimmed using Trimmomatic v0.39 [18] to remove adaptors and low-quality reads.
    suggested: (Trimmomatic, RRID:SCR_011848)
    Scaffolds were queried against the NCBI non-redundant protein database [21] using DIAMOND v2.0.4 [22].
    suggested: (DIAMOND, RRID:SCR_009457)
    Aligned reads were manually inspected using Geneious prime v2020.1.2 (2020) (https://www.geneious.com/), and consensus sequences were generated using a minimum of 3X read-depth coverage to make a base call.
    suggested: (Geneious, RRID:SCR_010519)
    7.467 [25], and the alignment checked for accuracy using MEGA v7 [26].
    suggested: (Mega BLAST, RRID:SCR_011920)
    Prior to the tree reconstruction, the ModelFinder application [31], as implemented in IQ-TREE, was used to select the best-fitting nucleotide substitution model.
    suggested: None
    suggested: (IQ-TREE, RRID:SCR_017254)
    Bayesian phylogenies were inferred using MrBayes v3.2.7 [32], using the GTR substitution model.
    suggested: (MrBayes, RRID:SCR_012067)
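
The consensus step quoted in the table above (a minimum of 3X read depth to make a base call) can be illustrated with a minimal sketch. This is not the Geneious implementation: the function name `call_consensus`, the per-position pileup representation, the simple majority rule, and the toy data are all assumptions made for illustration. Only the depth threshold of 3 comes from the quoted text.

```python
from collections import Counter

MIN_DEPTH = 3  # the 3X threshold quoted from the preprint's methods


def call_consensus(pileup: list[str], min_depth: int = MIN_DEPTH) -> str:
    """Call one consensus base per position (hypothetical helper).

    pileup[i] is a string of the bases that aligned reads show at position i.
    Positions covered by fewer than min_depth unambiguous bases become 'N'.
    The majority rule used here is an assumption, not the authors' method.
    """
    consensus = []
    for bases in pileup:
        # Keep only unambiguous nucleotide calls.
        observed = [b for b in bases.upper() if b in "ACGT"]
        if len(observed) < min_depth:
            consensus.append("N")  # too little coverage to make a call
            continue
        base, _count = Counter(observed).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Toy pileup (hypothetical data): depths are 4, 2, 3, 0 and 5.
toy_pileup = ["AAAA", "CC", "GGT", "", "TTTTA"]
print(call_consensus(toy_pileup))  # prints ANGNT
```

Positions two and four fall below the threshold and are masked as `N`, which is why a depth cutoff of this kind leaves gaps in a "nearly full-length" genome rather than guessing at poorly covered bases.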

    Results from OddPub: Thank you for sharing your data.

    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.

    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

    Results from JetFighter: We did not find any issues relating to colormaps.

    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

  2. Our take

    This study, available as a preprint and thus not yet peer reviewed, identified a new lineage of coronaviruses related to SARS-CoV-2 in Rhinolophus shameli bats sampled in Cambodia in 2010. The new viruses are too genetically distant from SARS-CoV-2 to be its direct evolutionary source. However, the results indicate that the group of viruses that includes SARS-CoV-2 and related bat viruses is potentially widespread, with members now reported from China, Japan, and Cambodia.

    Study design


    Study population and setting

    To investigate the animal origins of SARS-CoV-2, research teams retrospectively analyzed a set of archived samples collected from six families of bats in Cambodia in 2010, and between 2012 and 2018. The samples included 162 oral swabs and 268 rectal swabs that were tested for the presence of coronaviruses using reverse transcription PCR (RT-PCR) targeting the RNA-dependent RNA polymerase (RdRp) gene. Samples positive for viruses related to SARS-CoV-2 at RdRp were subjected to metagenomic sequencing to obtain full-length genome sequences. The resulting sequences were used for phylogenetic analyses, identification of recombinant regions, and comparison of the spike receptor binding domain (RBD) between SARS-CoV-2 and the novel viruses.

    Summary of main findings

    Out of the 430 swab samples tested, 16 (3.72%) were positive for coronaviruses, 5 of which were betacoronaviruses. Samples positive for betacoronaviruses were tested using another RT-PCR targeting the RdRp of sarbecoviruses, the group of coronaviruses that includes SARS-CoV, SARS-CoV-2, and related bat coronaviruses. Two of the 5 samples were positive; both were collected from Rhinolophus shameli bats in December 2010 in Steung Treng province in northeastern Cambodia. Sequencing produced nearly full-length genomes (RshSTT182 and RshSTT200) that were nearly identical to one another in sequence and genome structure. Phylogenetic analysis shows that these two viruses form a new sublineage of SARS-CoV-2-related viruses that is more closely related to SARS-CoV-2 than the coronaviruses identified in pangolins, but more distant than two viruses identified in Rhinolophus species in Yunnan province, China (RaTG13 and RmYN02). At the whole genome level, the two new viruses share 92.6% nucleotide identity with SARS-CoV-2. One divergent region of the spike N-terminal domain clusters more closely with SARS-related coronaviruses, indicating some history of recombination with other sarbecoviruses in this new lineage. The RBDs of RshSTT182 and RshSTT200 are highly similar to that of SARS-CoV-2, sharing five of the six key amino acids involved in binding to the ACE2 receptor for entry into human cells. However, the unique polybasic cleavage site within the spike protein of SARS-CoV-2 is absent in the two new viruses.
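
The proportions quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the positivity rate from the counts given in this review and shows one common way to compute percent nucleotide identity between two aligned sequences. The helper names `percent` and `pairwise_identity`, the gap-skipping convention, and the toy sequences are assumptions for illustration; the preprint's own identity calculation may differ.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are skipped. This is one
    common convention and is assumed here, not taken from the preprint.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Counts quoted in the review: 162 oral + 268 rectal swabs, 16 positives.
total_swabs = 162 + 268
positivity = percent(16, total_swabs)
print(total_swabs, round(positivity, 2))  # prints 430 3.72

# Toy aligned fragments (hypothetical, not real viral sequence):
# 10 gap-free columns, 9 of which match.
toy_a = "ATGCTTAGC-A"
toy_b = "ATGCTAAGCTA"
print(round(pairwise_identity(toy_a, toy_b), 1))  # prints 90.0
```

By this measure, 92.6% identity means that roughly 7 to 8 of every 100 compared positions differ.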

    Study strengths

    The study benefits from a moderately large sample of bats from multiple species in Cambodia, a region with a high diversity of Rhinolophus species. The sequencing methods used were appropriate for characterizing the features that distinguish RshSTT182 and RshSTT200 from SARS-CoV-2 and other viruses.

    Limitations

    While the viruses detected in Cambodian bats show important similarities with SARS-CoV-2, they are still quite distantly related to it and do not represent its direct evolutionary progenitor. Because the study was a retrospective analysis of archived samples, the results may not represent the diversity of coronaviruses currently circulating in this region. Analysis of the RBD amino acids is not sufficient to establish whether these viruses can enter human or bat cells; cell culture experiments are needed to determine this. Data on the other bat species and the sample types (oral or rectal) that were positive were reported in the Supplementary Materials, which are not currently available with the preprint.

    Value added

    Although the new viruses are not the closest known relatives of SARS-CoV-2 in bats, this paper expands the species range and geographic area in which SARS-CoV-2-related viruses may be circulating. As with SARS-related coronaviruses, these viruses may be widespread in Rhinolophus bats in East and Southeast Asia, and recombination among virus lineages appears to be facilitated by the overlapping geographic ranges of Rhinolophus species in this region.