Single source of pangolin CoVs with a near identical Spike RBD to SARS-CoV-2

Abstract

Multiple publications have independently described pangolin CoV genomes from the same batch of smuggled pangolins confiscated in Guangdong province in March 2019. We analyzed the three metagenomic datasets that sampled this batch of pangolins and found that the two complete pangolin CoV genomes, GD_1 by Xiao et al. (Nature) and MP789 by Liu et al. (PLoS Pathogens), were both built primarily using the 2019 dataset first described by Liu et al. (Viruses). Other publications, such as Zhang et al. (Current Biology) and Lam et al. (Nature), have also relied on this same dataset by Liu et al. (Viruses) for their assembly of the Guangdong pangolin CoV sequences and comparisons to SARS-CoV-2. To our knowledge, all of the published pangolin CoV genome sequences that share a highly similar Spike receptor binding domain with SARS-CoV-2 originate from this single batch of smuggled pangolins. This raises the question of whether pangolins are truly reservoirs or hosts of SARS-CoV-2-related coronaviruses in the wild, or whether the pangolins may have contracted the CoV from another host species during trafficking. Our observations highlight the importance of requiring authors to publish their complete genome assembly pipeline and all contributing raw sequence data, particularly those supporting epidemiological investigations, to empower peer review and independent analysis of the sequence data. This is necessary to ensure the accuracy of both the data and the conclusions presented by each publication.

Article activity feed

  1. SciScore for 10.1101/2020.07.07.184374:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Duplicate reads were marked and clean reads were coordinate-sorted using samtools version 1.10 (sub-commands markdup and sort, respectively) [19]. Read coverage statistics: We computed the following read coverage statistics using bedtools version 2.29.2 [20]: (1) percentage breadth of coverage with respect to GD_1; (2) mean depth of coverage with all mapped reads; and (3) mean depth of coverage without duplicate reads.
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    bedtools
    suggested: (BEDTools, RRID:SCR_006646)
    Read coverage profiles and read alignments were visualized using IGV version 2.8.2 [22].
    IGV
    suggested: None

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

  2. SciScore for 10.1101/2020.07.07.184374:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    NCBI SRA BioProject: PRJNA573298) (Fig 1, Supplementary Fig 1).
    NCBI SRA BioProject
    suggested: (NCBI, SCR_006472)
    We have queried NCBI databases as well as the China National GeneBank Database (https://db.cngb.org/).
    NCBI
    suggested: (NCBI, SCR_006472)
    Duplicate reads were marked and clean reads were coordinate-sorted using samtools version 1.10 (subcommands markdup and sort, respectively) [14]. Read coverage statistics: We computed the following read coverage statistics using bedtools version 2.29.2 [15]: (1) percentage breadth of coverage with respect to GD_1; (2) mean depth of coverage with all mapped reads; and (3) mean depth of coverage without duplicate reads.
    samtools
    suggested: (Samtools, SCR_002105)
    bedtools
    suggested: (BEDTools, SCR_006646)
    Read coverage profiles and read alignments were visualized using IGV version 2.8.2 [17].
    IGV
    suggested: None
    Data Availability: The Liu et al. Viruses data can be found at NCBI SRA BioProject PRJNA573298 (also accession: SRP223042) and Genome Warehouse BioProject PRJCA002224 (also accession: GWHABKW00000000; https://bigd.big.ac.cn/).
    BioProject
    suggested: (NCBI BioProject, SCR_004801)
    

    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.


    Results from OddPub: Thank you for sharing your data.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore is not a substitute for expert review. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) in the manuscript, and detects sentences that appear to be missing RRIDs. SciScore also checks to make sure that rigor criteria are addressed by authors. It does this by detecting sentences that discuss criteria such as blinding or power analysis. SciScore does not guarantee that the rigor criteria that it detects are appropriate for the particular study. Instead it assists authors, editors, and reviewers by drawing attention to sections of the manuscript that contain or should contain various rigor criteria and key resources. For details on the results shown here, including references cited, please follow this link.
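
The three read coverage statistics quoted in the resource tables above (percentage breadth of coverage, mean depth with all mapped reads, and mean depth without duplicate reads) can be illustrated with a small sketch. This is not the authors' pipeline: the quoted sentences do not say which bedtools sub-command was used, so the sketch assumes per-base depth has already been exported as three tab-separated columns (reference name, position, depth), the layout produced by tools such as `bedtools genomecov -d` or `samtools depth -a`. The function names and the toy depth values are hypothetical.

```python
from typing import Iterable, List, Tuple


def parse_depth_lines(lines: Iterable[str]) -> List[int]:
    """Read tab-separated per-base depth records (reference, position, depth)."""
    depths = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _ref, _pos, depth = line.split("\t")
        depths.append(int(depth))
    return depths


def coverage_stats(depths: List[int]) -> Tuple[float, float]:
    """Return (percentage breadth of coverage, mean depth of coverage)."""
    if not depths:
        raise ValueError("no positions supplied")
    # Breadth: share of reference positions covered by at least one read.
    covered = sum(1 for d in depths if d > 0)
    breadth_pct = 100.0 * covered / len(depths)
    # Mean depth: total depth averaged over every reference position.
    mean_depth = sum(depths) / len(depths)
    return breadth_pct, mean_depth


# Toy 5-base reference (hypothetical values, not data from the study).
all_reads = ["ref\t1\t4", "ref\t2\t4", "ref\t3\t0", "ref\t4\t2", "ref\t5\t10"]
dedup_reads = ["ref\t1\t2", "ref\t2\t1", "ref\t3\t0", "ref\t4\t1", "ref\t5\t1"]

breadth, mean_all = coverage_stats(parse_depth_lines(all_reads))
_, mean_dedup = coverage_stats(parse_depth_lines(dedup_reads))
print(f"breadth={breadth:.1f}% mean_all={mean_all:.1f} mean_dedup={mean_dedup:.1f}")
```

In this toy input, removing duplicates lowers the mean depth but leaves the breadth unchanged. A large gap between the two mean depths would suggest that much of the apparent coverage comes from duplicate reads.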