Identification of a pangolin niche for a 2019-nCoV-like coronavirus through an extensive meta-metagenomic search
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
In numerous instances, tracking the biological significance of a nucleic acid sequence can be augmented through the identification of environmental niches in which the sequence of interest is present. Many metagenomic datasets are now available, with deep sequencing of samples from diverse biological niches. While any individual metagenomic dataset can be readily queried using web-based tools, meta-searches through all such datasets are less accessible. In this brief communication, we demonstrate such a meta-metagenomic approach, examining close matches to the Wuhan coronavirus 2019-nCoV in all high-throughput sequencing datasets in the NCBI Sequence Read Archive accessible with the keyword "virome". In addition to the homology to bat coronaviruses observed in descriptions of the 2019-nCoV sequence (F. Wu et al. 2020, Nature, doi.org/10.1038/s41586-020-2008-3; P. Zhou et al. 2020, Nature, doi.org/10.1038/s41586-020-2012-7), we note a strong homology to numerous sequence reads in a metavirome dataset generated from the lungs of deceased pangolins reported by Liu et al. (Viruses 11:11, 2019, http://doi.org/10.3390/v11110979). Our observations are relevant to discussions of the derivation of 2019-nCoV and illustrate the utility and limitations of meta-metagenomic search tools in effective and rapid characterization of potentially significant nucleic acid sequences.
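To make the idea of screening many read datasets for close matches to one query genome concrete, here is a minimal sketch in Python. It is not the authors' script: it assumes a simple exact k-mer lookup, and the function names, the toy reference, and the value of `k` are all illustrative choices rather than details taken from the article.

```python
from typing import Iterable, Iterator, Set, Tuple

# Translation table for complementing DNA bases (assumes plain ACGT input).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_kmer_index(reference: str, k: int) -> Set[str]:
    """Collect every k-mer from both strands of the reference."""
    reference = reference.upper()
    kmers: Set[str] = set()
    for strand in (reference, reverse_complement(reference)):
        for i in range(len(strand) - k + 1):
            kmers.add(strand[i:i + k])
    return kmers


def matching_reads(
    reads: Iterable[Tuple[str, str]], index: Set[str], k: int
) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) for reads sharing at least one exact k-mer."""
    for read_id, seq in reads:
        seq = seq.upper()
        if any(seq[i:i + k] in index for i in range(len(seq) - k + 1)):
            yield read_id, seq


# Toy data (hypothetical): r1 is a forward-strand fragment of the reference,
# r2 is a reverse-strand fragment, r3 is unrelated.
reference = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
reads = [
    ("r1", "GGCGTACGTTAGCC"),
    ("r2", "TGTAATCGGATC"),
    ("r3", "TTTTTTTTTTTTTT"),
]
k = 8  # illustrative; a real search would use a longer k for specificity
index = build_kmer_index(reference, k)
hits = [read_id for read_id, _ in matching_reads(reads, index, k)]
```

In this sketch the index is built once per query genome and every dataset is streamed past it, so memory use depends on the query length rather than on the size of the archive. Exact k-mer matching only finds reads with an uninterrupted identical stretch, so more divergent relatives would need a proper aligner.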
Article activity feed
SciScore for 10.1101/2020.02.08.939660:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms (sentence from the manuscript, followed by the detected resource)
- Sentence: Sequence data: All sequence data for this analysis were downloaded from the National Center for Biotechnology Information (NCBI) website, with individual sequences downloaded through a web interface and metagenomic datasets downloaded from the NCBI Sequence Read Archive (SRA) using the SRA-tools package (version 2.9.1).
  Resource: NCBI Sequence Read Archive; suggested: (NCBI Sequence Read Archive (SRA), RRID:SCR_004891)
- Sentence: This was implemented in a Python script run using the PyPy accelerated interpreter.
  Resource: Python; suggested: (IPython, RRID:SCR_001658)
- Sentence: Assessment of nucleotide similarity between 2019-nCoV, pangolin metavirome reads, and closely related bat coronaviruses: All pangolin metavirome reads that aligned to the 2019-nCoV genome with BWA-MEM after adapter trimming with cutadapt were used for calculation.
  Resource: BWA-MEM; suggested: (Sniffles, RRID:SCR_017619)
- Sentence: The bat coronavirus genomes were aligned to the 2019-nCoV genome in a multiple sequence alignment using the web-interface for Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) (6) with default settings.
  Resource: Clustal Omega; suggested: (Clustal Omega, RRID:SCR_001591)
- Sentence: Together the datasets included information from 9014 NCBI Short Read Archive entries with (in total) 6.2 × 10^10 individual reads and 8.4 × 10^12 base pairs.
  Resource: Short Read Archive; suggested: None
Results from OddPub: Thank you for sharing your code.
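The nucleotide-similarity assessment quoted above can be illustrated with a short Python sketch. The exact calculation used in the manuscript is not shown here, so this assumes one common definition: percent identity over alignment columns where both sequences carry a called base, skipping gaps and `N`. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical bases over columns where both sequences have a base.

    Both inputs must come from the same alignment, so they have equal length.
    Columns containing a gap or an ambiguous 'N' in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap or a == "N" or b == "N":
            continue  # column is not informative for identity
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): one gap column and one mismatch.
identity = percent_identity("ACGT-ACGT", "ACGTTACGA")
```

Here 8 columns are comparable and 7 agree, giving 87.5 percent. Whether gap columns count as mismatches is a design choice that changes the reported value, so any comparison between studies should state which convention was used.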
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on page 10. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.