Characterization of the substitution hotspots in SARS-CoV-2 genome using BioAider and detection of a SR-rich region in N protein providing further evidence of its animal origin
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The novel human coronavirus (SARS-CoV-2) is the cause of the worldwide coronavirus disease 2019 (COVID-19) pandemic. Accumulating sequencing data have revealed abundant single nucleotide variations in the SARS-CoV-2 genome. However, it is difficult to quickly analyze genomic variation and screen key mutations of SARS-CoV-2. In this study, we developed a visual program, named BioAider, for quick and convenient sequence annotation and mutation analysis of multiple genome sequences. Using BioAider, we conducted a comprehensive genome variation analysis of 3,240 SARS-CoV-2 genome sequences. We detected 14 substitution hotspots within the SARS-CoV-2 genome, including 10 non-synonymous and 4 synonymous ones. Among these hotspots, NSP13-Y541C was predicted to be a crucial substitution that might affect the unwinding activity of NSP13, a key protein for viral replication. We also found 3 groups of potentially linked substitution hotspots that are worth further study. In particular, we discovered an SR-rich region (aa 184-204) in the N protein of SARS-CoV-2 that is distinct from SARS-CoV, indicating a more complex replication mechanism and a unique N-M interaction in SARS-CoV-2. Interestingly, the number of SRXX repeat fragments in the SR-rich region reflected the evolutionary relationship between SARS-CoV-2 and SARS-CoV-2-related animal coronaviruses, providing further evidence of its animal origin. Overall, we developed an efficient tool for rapid identification of mutations, identified substitution hotspots in SARS-CoV-2 genomes, and detected a distinctive polymorphic SR-rich region in the N protein. This tool and the detected hotspots could facilitate viral genomic studies and may contribute to the screening of antiviral target sites.
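As a rough illustration of what screening an alignment for substitution hotspots involves, the sketch below tallies substitutions relative to a reference across pre-aligned sequences and reports those above a frequency cutoff. It is not BioAider's implementation: the function names, the toy sequences, and the `min_fraction` threshold are all assumptions made for this example, and the abstract does not state how the authors define a hotspot.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def substitution_frequencies(reference, sequences):
    """Count substitutions relative to `reference` in pre-aligned sequences.

    Returns a Counter keyed by (1-based position, reference base, variant base).
    Gaps and ambiguous bases (e.g. '-' or 'N') are ignored.
    """
    counts = Counter()
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("all sequences must be aligned to the reference length")
        for pos, (ref_base, base) in enumerate(zip(reference, seq), start=1):
            if base != ref_base and base in VALID_BASES and ref_base in VALID_BASES:
                counts[(pos, ref_base, base)] += 1
    return counts


def hotspots(reference, sequences, min_fraction=0.05):
    """Return substitutions seen in at least `min_fraction` of the sequences.

    The cutoff is a hypothetical choice for illustration only.
    """
    counts = substitution_frequencies(reference, sequences)
    total = len(sequences)
    return sorted(
        (pos, ref_base, base, count / total)
        for (pos, ref_base, base), count in counts.items()
        if count / total >= min_fraction
    )


# Toy data (invented for the example, not SARS-CoV-2 sequence).
reference = "ATGCGT"
sequences = ["ATGCGT", "ATGTGT", "ATGTGT", "ACGCGT", "ATGCGN"]

for pos, ref_base, base, fraction in hotspots(reference, sequences, min_fraction=0.3):
    print(f"{ref_base}{pos}{base}: {fraction:.0%} of sequences")
```

Classifying each reported site as synonymous or non-synonymous would additionally require the coding-region annotation and a codon table, which this sketch leaves out.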
Article activity feed
SciScore for 10.1101/2020.06.04.135293:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentence: Main functions and working principle of BioAider: BioAider V1.0 was developed based on Python 3.7 and R 3.5.2, and used PyQt5 for interface packaging.
Resources: BioAider (suggested: None); Python (suggested: IPython, RRID:SCR_001658)
Sentence: Multiple sequence alignment of genomic sequences of SARS-CoV-2 were accomplished using MAFFT v7.407 [40].
Resources: MAFFT (suggested: MAFFT, RRID:SCR_011811)
Sentence: Then we used MUSCLE program in MEGA v7.0.14 to align these coding genes based on codons method [41].
Resources: MUSCLE (suggested: MUSCLE, RRID:SCR_011812); MEGA (suggested: Mega BLAST, RRID:SCR_011920)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on pages 29 and 30. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.