Analysis of the Dynamics and Distribution of SARS-CoV-2 Mutations and its Possible Structural and Functional Implications
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Eight months after the pandemic declaration, COVID-19 has not been globally controlled. Several efforts to control SARS-CoV-2 dissemination are still under way, including vaccines and drug treatments. The effectiveness of these interventions depends, in part, on the regions they target not varying considerably. Although the mutation rate of SARS-CoV-2 is known to be relatively low, it is necessary to monitor the adaptation and evolution of the virus at the different stages of the pandemic. Thus, identifying mutations, analyzing their dynamics, and assessing their possible functional and structural implications are relevant. Here, we first estimate the number of COVID-19 cases caused by a virus carrying a specific mutation and then calculate its global relative frequency (NRFp). Using this approach on a dataset of 100 924 genomes from GISAID, we identified 41 mutations present in viruses in an estimated 750 000 global COVID-19 cases (0.03 NRFp). We classified these mutations into three groups: high-frequent, low-frequent non-synonymous, and low-frequent synonymous. Analysis of the dynamics of these mutations by month and continent showed that high-frequent mutations appeared early in the pandemic, that all of them are present on all continents, and that some of them are almost fixed in the global population. On the other hand, low-frequent mutations (non-synonymous and synonymous) appeared late in the pandemic and seem to be at least partially continent-specific. This could be because high-frequent mutations appeared early, when lockdown policies had not yet been applied, whereas low-frequent mutations appeared after lockdown policies were in place, which prevented their global dissemination. Finally, we present a brief structural and functional review of the analyzed ORFs and the possible implications of the 25 identified non-synonymous mutations.
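The abstract does not spell out how NRFp is computed. As a rough illustration only, the sketch below assumes that the number of cases carrying a mutation is estimated by multiplying the mutation's relative frequency among sequenced genomes in each continent-month stratum by the reported case count for that stratum, and that the global figure is the sum of those estimates divided by the total number of cases. The function name, the argument names, and the weighting scheme are all assumptions, not the authors' code.

```python
def normalized_relative_frequency(rf_by_stratum, cases_by_stratum):
    """Case-weighted global relative frequency of one mutation (assumed scheme).

    rf_by_stratum:    {(continent, month): relative frequency of the mutation
                       among sequenced genomes in that stratum}
    cases_by_stratum: {(continent, month): reported COVID-19 cases}

    Returns (estimated_cases_with_mutation, global_relative_frequency).
    """
    total_cases = sum(cases_by_stratum.values())
    if total_cases == 0:
        raise ValueError("no reported cases in any stratum")

    # Strata without sequence data contribute zero estimated cases here;
    # this is a simplification made for the sketch.
    estimated_cases = sum(
        rf_by_stratum.get(stratum, 0.0) * cases
        for stratum, cases in cases_by_stratum.items()
    )
    return estimated_cases, estimated_cases / total_cases


# Toy numbers, invented purely for illustration.
rf = {("Europe", "2020-03"): 0.5, ("Asia", "2020-03"): 0.25}
cases = {("Europe", "2020-03"): 1000, ("Asia", "2020-03"): 3000}
estimated, nrf = normalized_relative_frequency(rf, cases)
```

Weighting by case counts rather than by the number of sequenced genomes is meant to reduce the influence of regions that sequence heavily relative to their case load; whether the paper handles missing strata the same way is not stated in the abstract.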
Article activity feed
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
- Sentence: Finally, we bound the 8 alignments using cat function in Linux and use this to extract regions corresponding to each of the ORFs and nsp regions of SARS-CoV-2 (regions as annotated in the NCBI database of the Wuhan-Hu-1 reference genome).
  Resource: NCBI; suggested: (NCBI, RRID:SCR_006472)
- Sentence: After that, sequences were divided by continent-month combinations, aligned using MAFFT with FFT-NS-2 strategy and default parameter settings (Katoh et al. 2002), columns with more than 98 % gaps were removed and relative frequencies of each base or gap in each position were calculated (RFp,m−c).
  Resource: MAFFT; suggested: (MAFFT, RRID:SCR_011811)
- Sentence: The number of cases of each country was obtained from the European Centre for Disease Prevention and Control: https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide.
  Resource: Control; suggested: None
- Sentence: Potential energy of mutational and wild-type models was minimized using Gromacs (v.2018.8) (Berendsen et al. 1995).
  Resource: Gromacs; suggested: (GROMACS, RRID:SCR_014565)
- Sentence: All structural images were produced using ChimeraX (v.1.1) (Pettersen et al. 2020) or Chimera (v.1.15) (Pettersen et al. 2004).
  Resource: ChimeraX; suggested: (UCSF ChimeraX, RRID:SCR_015872)
  Resource: Chimera; suggested: (Chimera, RRID:SCR_002959)
- Sentence: 2014) with the ggplot2 package (Wickham 2016).
  Resource: ggplot2; suggested: (ggplot2, RRID:SCR_014601)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.
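The alignment post-processing quoted in the resources table (removing columns with more than 98 % gaps, then computing the relative frequency of each base or gap at each position) can be sketched as follows. This is a minimal illustration that assumes the aligned sequences are already loaded as equal-length strings; the function name and the `max_gap_fraction` parameter are hypothetical and do not come from the authors' pipeline.

```python
from collections import Counter


def column_frequencies(aligned_seqs, max_gap_fraction=0.98):
    """Per-position relative frequencies of bases and gaps.

    aligned_seqs: list of equal-length strings from one alignment
                  (for example, one continent-month subset).
    Columns whose gap fraction exceeds max_gap_fraction are dropped.
    Returns one {symbol: relative frequency} dict per retained column.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")

    n = len(aligned_seqs)
    retained = []
    # zip(*aligned_seqs) iterates over alignment columns.
    for column in zip(*aligned_seqs):
        counts = Counter(symbol.upper() for symbol in column)
        if counts["-"] / n > max_gap_fraction:
            continue  # column is almost entirely gaps
        retained.append({symbol: c / n for symbol, c in counts.items()})
    return retained


# Toy alignment, invented for illustration; the third column is all gaps.
toy = ["AC-T", "AC-T", "AG-T", "A--T"]
freqs = column_frequencies(toy)
```

Note that dropping columns shifts the positions of everything downstream, so a real pipeline would also need to keep a mapping from retained columns back to reference-genome coordinates; that bookkeeping is omitted here.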