Insights on early mutational events in SARS-CoV-2 virus reveal founder effects across geographical regions
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (PeerJ)
- Evaluated articles (ScreenIT)
Abstract
Here we describe early mutational events across publicly available SARS-CoV-2 sequences from the Sequence Read Archive (SRA) and GenBank repositories. Up to 27 March 2020, we downloaded 50 Illumina datasets, mostly from China, the USA (WA State) and Australia (VIC). A total of 30 datasets (60%) contain at least one founder mutation, and most of the variants are missense (over 63%). Five point mutations with a clonal (founder) effect were found in USA next-generation sequencing samples. Sequencing samples from North America in GenBank (22 April 2020) present this signature with allele frequencies of up to 39% among samples (n = 1,359). Australian variant signatures were more diverse than those of the USA samples, but clonal events were still found in these samples. Mutations in the helicase, encoded by the ORF1ab gene in SARS-CoV-2, were predominant, among others, suggesting that these regions are actively evolving. Finally, we firmly urge that primer sets for diagnosis be carefully designed, since rapidly occurring variants would affect the performance of reverse transcription quantitative PCR (RT-qPCR) based viral testing.
Article activity feed
SciScore for 10.1101/2020.04.09.034462:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
- Sentence: Data Collection: Raw Illumina sequencing data were downloaded from the following NCBI SRA BioProjects: SRA: PRJNA601736 (Chinese datasets), SRA: PRJNA603194 (Chinese dataset) (Wu et al. 2020b), SRA: PRJNA605907 (Chinese datasets) (Shen et al. 2020), SRA: PRJNA607948 (USA-Wisconsin datasets), SRA: PRJNA608651 (Nepal dataset), SRA: PRJNA610428 (USA-Washington datasets), SRA: PRJNA612578 (USA-San-Diego dataset), SRA: PRJNA231221 (USA-Washington dataset) (Sichtig et al. 2019), SRA: PRJNA613958 (Australian-Victoria datasets), SRA: PRJNA231221 (USA-Maryland dataset), and SRA: PRJNA614995 (USA-Utah datasets).
  Resource: NCBI SRA BioProjects (suggested: None)
- Sentence: Data processing: Raw reads were aligned with bowtie2 aligner (v2.2.6) (Langmead & Salzberg 2012) against SARS-CoV-2 reference genome NC_045512.2 (https://www.ncbi.nlm.nih.gov/nuccore/NC_045512), using the following parameters: -D 20 -R 3 -N 0 -L 20 -i S,1,0.50.
  Resource: bowtie2 (suggested: Bowtie 2, RRID:SCR_016368)
- Sentence: Samtools v1.9 (using htslib v1.9) (Li et al. 2009) was used to sort sam files, remove duplicate reads and index bam files. bcftools v1.9 (part of the samtools framework) was used to obtain depth of coverage in each aligned sample.
  Resource: Samtools (suggested: SAMTOOLS, RRID:SCR_002105)
Results from OddPub: Thank you for sharing your code and data.
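The data-processing sentences quoted in Table 2 can be sketched as a small command builder. This is only an illustration under stated assumptions, not the authors' script: the bowtie2 parameters are the ones quoted in the report, but the file names, the index prefix, the single-end versus paired-end choice, and the use of `samtools rmdup` for duplicate removal are all assumptions (the report names the tools, not the subcommands). The bcftools depth-of-coverage step is left out because the report does not say how it was invoked.

```python
import shlex

# Alignment parameters quoted in the report for bowtie2 (v2.2.6).
BOWTIE2_PARAMS = ["-D", "20", "-R", "3", "-N", "0", "-L", "20", "-i", "S,1,0.50"]


def build_pipeline(index_prefix, reads, sample, paired=False):
    """Return the per-sample commands as argument lists (nothing is executed).

    index_prefix, reads and sample are hypothetical inputs: the report gives
    neither file names nor the read layout of each dataset.
    """
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"

    # bowtie2 takes -1/-2 for paired reads and -U for unpaired reads.
    if paired:
        read_args = ["-1", reads[0], "-2", reads[1]]
    else:
        read_args = ["-U", ",".join(reads)]

    # Assumption: duplicates removed with `samtools rmdup` (-s for single-end).
    rmdup = ["samtools", "rmdup"]
    if not paired:
        rmdup.append("-s")

    return [
        ["bowtie2", *BOWTIE2_PARAMS, "-x", index_prefix, *read_args, "-S", sam],
        ["samtools", "sort", "-o", sorted_bam, sam],
        rmdup + [sorted_bam, dedup_bam],
        ["samtools", "index", dedup_bam],
    ]


if __name__ == "__main__":
    # Print the commands for one hypothetical paired-end sample.
    commands = build_pipeline(
        "NC_045512.2", ["sample_1.fastq", "sample_2.fastq"], "sample", paired=True
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once bowtie2 and samtools are installed and a bowtie2 index has been built for the reference genome. Building the commands separately from running them keeps the sketch easy to inspect.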
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.