Compositional Variability and Mutation Spectra of Monophyletic SARS-CoV-2 Clades
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
COVID-19 and its causative pathogen SARS-CoV-2 have plunged the world into a staggering pandemic within a few months, and the global fight against both has been intensifying. Here, we describe an analysis procedure in which genome composition and its variables are related, through the genetic code, to molecular mechanisms, based on an understanding of RNA replication and its feedback loop from mutation to viral proteome sequence fraternity, including effective sites on the replicase-transcriptase complex. Our analysis starts with primary sequence information, an identity-based phylogeny built from 22,051 SARS-CoV-2 sequences, and an evaluation of sequence variation patterns as mutation spectra and their 12 permutations among organized clades. All are tailored to two key mechanisms: strand-biased and function-associated mutations. Our findings are as follows: 1) The most dominant mutation is the C-to-U permutation, whose abundant second-codon-position counts alter amino acid composition toward higher molecular weight and lower hydrophobicity, although most are assumed to be slightly deleterious. 2) The second most abundant group includes three negative-strand mutations (U-to-C, A-to-G, and G-to-A) and a positive-strand mutation (G-to-U) due to DNA repair mechanisms after cellular abasic events. 3) A clade-associated biased mutation trend is found, attributable to an elevated level of negative-sense strand synthesis. 4) Within-clade permutation variation is very informative for associating non-synonymous mutations with viral proteome changes. These findings call for a platform where emerging mutations are mapped onto mostly subtle but fast-adjusting viral proteomes and transcriptomes, to provide biological and clinical information after logical convergence for effective pharmaceutical and diagnostic applications. Such actions are desperately needed, especially in the middle of the war against COVID-19.
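As a rough illustration of what a mutation spectrum over the 12 permutations means in practice, the sketch below counts each directed substitution between a reference and one aligned sample, and reports the codon position of a site. It is not the authors' pipeline: the function names `mutation_spectrum` and `codon_position` are hypothetical, the input is assumed to be a gap-aware pairwise alignment of equal length, and the coding region is assumed to start at a known 0-based offset.

```python
from collections import Counter
from itertools import permutations

BASES = "ACGU"


def mutation_spectrum(reference, sample):
    """Count each of the 12 directed substitutions (reference base > sample base).

    Assumes both sequences come from the same alignment, so they have equal
    length. Gaps and ambiguity codes are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    # Start all 12 ordered pairs at zero so absent types still appear.
    spectrum = Counter({f"{a}>{b}": 0 for a, b in permutations(BASES, 2)})
    for ref_base, alt_base in zip(reference.upper(), sample.upper()):
        # Database records often write U as T; normalise to RNA letters.
        ref_base = ref_base.replace("T", "U")
        alt_base = alt_base.replace("T", "U")
        if ref_base in BASES and alt_base in BASES and ref_base != alt_base:
            spectrum[f"{ref_base}>{alt_base}"] += 1
    return spectrum


def codon_position(site, cds_start):
    """Return 1, 2 or 3 for a 0-based site in a CDS starting at 0-based cds_start."""
    return (site - cds_start) % 3 + 1


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    spectrum = mutation_spectrum("AUGCCUGA", "AUGUCUGG")
    for change, count in spectrum.items():
        if count:
            print(change, count)
```

Summing such per-sample counters within each clade would give a clade-level spectrum, and tallying `codon_position` per substitution type is one way to approximate the second-codon-position counts the abstract mentions.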
Article activity feed
SciScore for 10.1101/2020.08.26.267781:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms (sentence from the manuscript, followed by the detected resource)
- Sentence: Further analyses of SARS-CoV-2 and related CoV genomes are referenced to genome annotation of the same reference genome (NC_045512.2) and other information provided by the RefSeq database at NCBI.
  Resource: RefSeq; suggested: (RefSeq, RRID:SCR_003496)
- Sentence: The sequences were aligned by using MUSCLE and the UPGMA tree was constructed by using MEGA-X [53].
  Resource: MUSCLE; suggested: (MUSCLE, RRID:SCR_011812)
- Sentence: FastTree (version 2.1.11) [56] is used to construct maximum likelihood phylogeny based on 5,121 genomes that have met our criteria, and iTol [57], an interactive web server was employed for setting an unrooted format and annotating samples.
  Resource: FastTree; suggested: (FastTree, RRID:SCR_015501)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.