Temporal evolution and adaptation of SARS-CoV-2 codon usage
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused an unprecedented pandemic. Since the first whole genome of SARS-CoV-2 was sequenced in January 2020, identifying its genetic variants has become crucial for tracking and evaluating their spread across the globe.
In this study, we compared 134,905 SARS-CoV-2 genomes, isolated from all affected countries since the outbreak of this novel coronavirus, with the first sequenced genome from Wuhan, China, to quantify the evolutionary divergence of SARS-CoV-2. To this end, we compared the codon usage patterns of the SARS-CoV-2 genes encoding the membrane protein (M), envelope protein (E), spike surface glycoprotein (S), nucleoprotein (N), and RNA-dependent RNA polymerase (RdRp). The polyproteins ORF1a and ORF1b were examined separately.
We found that SARS-CoV-2 tends to diverge over time by accumulating mutations in its genome, specifically in the sequences encoding proteins N and S. Interestingly, different patterns of codon usage were observed among these genes. Genes S and N tend to use a narrower set of synonymous codons that are better optimized to the human host. Conversely, genes E and M consistently use a broader set of synonymous codons, which does not vary with respect to the reference genome. The time evolution of CAI and SiD shows a decreasing tendency for most genes. Forsdyke plots, used to study the nature of the mutations, show a rapid evolutionary divergence of each gene, as indicated by the low values of the x-intercepts.
Article activity feed
SciScore for 10.1101/2020.05.29.123976: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Experimental Models: Organisms/Strains
Sentences: We downloaded these PPI from NDEx (https://public.ndexbio.org/network/43803262-6d69-11ea-bfdc-0ac135e8bacf).
Resources: https://public.ndexbio.org/network/43803262-6d69-11ea-bfdc-0ac135e8bacf; suggested: None
Software and Algorithms
Sentences: 2.2 Relative Synonymous Codon Usage: RSCU vectors for all the genomes were computed by using an in-house Python script, following the formula: In the RSCU_i, X_i is the number of occurrences, in a given genome, of codon i, and the sum in the denominator runs over its n_i synonymous codons.
Resources: Python; suggested: (IPython, RRID:SCR_001658)
Sentences: We then showed the average values of the distance over time with a heatmap, drawn with MATLAB.
Resources: MATLAB; suggested: (MATLAB, RRID:SCR_001622)
Sentences: The protein sequences were aligned using Biopython.
Resources: Biopython; suggested: (Biopython, RRID:SCR_007173)
Sentences: To detect communities of PPI, we used the application Molecular Complex Detection (MCODE) [32] in Cytoscape (https://cytoscape.org/).
Resources: Cytoscape; suggested: (Cytoscape, RRID:SCR_003032); suggested: (CluePedia Cytoscape plugin, RRID:SCR_015784)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
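The RSCU sentence quoted under Software and Algorithms in Table 2 describes the terms of a formula that did not survive in this feed. As a rough illustration only, the sketch below assumes the standard definition, RSCU_i = X_i / ((1/n_i) * sum of X_j over the n_i synonymous codons of codon i), and the standard genetic code. It is not the authors' in-house script; the function name rscu, the codon table construction, and the choice to keep stop codons and single-codon amino acids in the output are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the in-house script referenced in the source may use a different table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(sequence):
    """Return a dict mapping each of the 64 codons to its RSCU value.

    The sequence is read in frame from its first base. Codons containing
    ambiguous bases are ignored. Families with no observed codon get 0.0.
    """
    seq = sequence.upper().replace("U", "T")
    # X_i: number of occurrences of each in-frame codon
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    # Group synonymous codons by the amino acid (or stop) they encode
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        n_i = len(codons)                       # number of synonymous codons
        total = sum(counts[c] for c in codons)  # sum of X_j over the family
        for c in codons:
            values[c] = counts[c] * n_i / total if total else 0.0
    return values


# Hypothetical toy input: four alanine codons (GCT, GCT, GCC, GCA)
example = rscu("GCTGCTGCCGCA")
print(example["GCT"], example["GCC"], example["GCG"])  # 2.0 1.0 0.0
```

A value of 1.0 means a codon is used as often as expected if all synonymous codons were used equally; values above or below 1.0 indicate over- or under-use. Applied per gene or per genome, the 64 values form the kind of RSCU vector the quoted sentence refers to.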
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.