Phylogenomics and population genomics of SARS-CoV-2 in Mexico during the pre-vaccination stage reveals variants of interest B.1.1.28.4 and B.1.1.222 or B.1.1.519 and the nucleocapsid mutation S194L associated with symptoms
Abstract
Understanding the evolution of the SARS-CoV-2 virus in various regions of the world during the COVID-19 pandemic is essential to help mitigate the effects of this devastating disease. We describe the phylogenomic and population genetic patterns of the virus in Mexico during the pre-vaccination stage, including asymptomatic carriers. A real-time quantitative PCR screening and phylogenomic reconstructions directed at sequence/structure analysis of the spike glycoprotein revealed the mutation of concern E484K in genomes from central Mexico, in addition to the nationwide prevalence of the imported variant 20C/S:452R (B.1.427/9). Overall, the detected variants in Mexico show spike protein mutations in the N-terminal domain (e.g. R190M), in the receptor-binding motif (e.g. T478K, E484K), within the S1–S2 subdomains (e.g. P681R/H, T732A), and at the base of the protein (V1176F), raising concerns about the lack of phenotypic and clinical data available for the variants of interest we postulate: 20B/478K.V1 (B.1.1.222 or B.1.1.519) and 20B/P.4 (B.1.1.28.4). Moreover, the population patterns of single nucleotide variants from symptomatic and asymptomatic carriers obtained with a self-sampling scheme confirmed the presence of several fixed variants and differences in allelic frequencies among localities. We identified the mutation N:S194L of the nucleocapsid protein as associated with symptomatic patients. Phylogenetically, this mutation is frequent in Mexican sub-clades. Our results highlight the dual and complementary role of the spike and nucleocapsid proteins in the adaptive evolution of SARS-CoV-2 to its hosts and provide a baseline for specific follow-up of mutations of concern during the vaccination stage.
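The abstract writes amino-acid substitutions in the usual shorthand: reference residue, position, substituted residue (E484K), with alternatives at one site separated by a slash (P681R/H) and an optional protein prefix (N:S194L). As a minimal sketch of how such labels could be split into structured records, the Python below parses that shorthand. The function name, the record type, and the choice of "S" (spike) as the default protein for unprefixed labels are illustrative assumptions, not part of the authors' pipeline.

```python
import re
from typing import List, NamedTuple, Tuple


class Substitution(NamedTuple):
    ref: str  # reference amino acid (one-letter code)
    pos: int  # 1-based residue position
    alt: str  # substituted amino acid (one-letter code)


# Optional "PROTEIN:" prefix, reference residue, position,
# then one or more alternative residues separated by "/".
_PATTERN = re.compile(
    r"^(?:(?P<gene>[A-Za-z0-9]+):)?"
    r"(?P<ref>[A-Z])(?P<pos>\d+)(?P<alts>[A-Z](?:/[A-Z])*)$"
)


def parse_substitutions(label: str, default_gene: str = "S") -> List[Tuple[str, Substitution]]:
    """Split a label such as 'P681R/H' into one record per alternative residue.

    default_gene is an assumption: unprefixed labels are treated as spike.
    """
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised substitution label: {label!r}")
    gene = match.group("gene") or default_gene
    ref = match.group("ref")
    pos = int(match.group("pos"))
    return [(gene, Substitution(ref, pos, alt)) for alt in match.group("alts").split("/")]


if __name__ == "__main__":
    for text in ["R190M", "T478K", "E484K", "P681R/H", "T732A", "V1176F", "N:S194L"]:
        print(text, "->", parse_substitutions(text))
```

Expanding P681R/H into two records keeps each substitution countable on its own, which matters when tallying frequencies per mutation.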
SciScore for 10.1101/2021.05.18.21256128:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Ethics: IRB: Ethical committee clearance: Ethical committee clearance.
Consent: An informed written consent for the use of surveillance samples was obtained from all patients.
Sex as a biological variable: not detected.
Randomization: not detected.
Blinding: not detected.
Power Analysis: not detected.
Table 2: Resources
Software and Algorithms
- Sentence: Detection of San Luis Potosí SARS-CoV2 positive samples was done with the GeneFinder COVID-19 PLUS RealAmp Kit.
  Resource: GeneFinder, suggested: (GENEFINDER, RRID:SCR_009190)
- Sentence: We identified 337 combinations of mutations and 315 unique mutations from 1552 (two sequences were filtered out because of quality issues) sequences using in-house Perl and Python scripts.
  Resource: Python, suggested: (IPython, RRID:SCR_001658)
- Sentence: We transformed the output file to study the incidence for 315 mutations, we grouped them in 11 clades, and we studied their covariances between one another, applying in-house scripts with R packages: tidyverse (Wickham et al. 2019), circlize (Gu et al. 2014), and Python modules: NumPy (Harris et al. 2020), Pandas (McKinney et al. 2010), matplotlib (Hunter, 2007), seaborn (Waskom et al. 2017)
  Resource: NumPy, suggested: (NumPy, RRID:SCR_008633)
  Resource: matplotlib, suggested: (MatPlotLib, RRID:SCR_008624)
- Sentence: They were then mapped to the NC_045512.2 version of the SARS-CoV-2 reference genome using BWA (Li & Durbin 2009) with default parameters.
  Resource: BWA, suggested: (BWA, RRID:SCR_010910)
- Sentence: Sam alignments were then converted to bam files and sorted using samtools (Li et al. 2009).
  Resource: samtools, suggested: (SAMTOOLS, RRID:SCR_002105)
- Sentence: (Schrodinger LLC, http://www.pymol.org).
  Resource: http://www.pymol.org, suggested: (PyMOL, RRID:SCR_000305)
Results from OddPub: Thank you for sharing your code and data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.
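The resources table quotes a step in which unique mutations and mutation combinations were tallied across genomes and covariances between mutations were studied. The authors' in-house scripts are not reproduced on this page, so the following is only a rough sketch of that kind of tally in plain Python. The four toy genomes, the function names, and the use of population covariance on 0/1 presence vectors are all assumptions made for illustration; the numbers it prints have no relation to the 315 mutations or 337 combinations reported in the manuscript.

```python
from collections import Counter

# Hypothetical toy input: sequence ID -> set of mutations called in that genome.
# The mutation labels come from the abstract; the genomes themselves are invented.
genomes = {
    "seq1": {"S:T478K", "S:P681H", "N:S194L"},
    "seq2": {"S:T478K", "S:P681H"},
    "seq3": {"S:E484K", "S:V1176F"},
    "seq4": {"S:T478K", "S:P681H", "N:S194L"},
}


def tally(genomes):
    """Return the set of unique mutations and a count of each distinct combination."""
    unique = set().union(*genomes.values())
    combos = Counter(frozenset(muts) for muts in genomes.values())
    return unique, combos


def presence_covariance(genomes, a, b):
    """Population covariance of the 0/1 presence vectors of mutations a and b."""
    n = len(genomes)
    xs = [1 if a in muts else 0 for muts in genomes.values()]
    ys = [1 if b in muts else 0 for muts in genomes.values()]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y


if __name__ == "__main__":
    unique, combos = tally(genomes)
    print("unique mutations:", len(unique))
    print("distinct combinations:", len(combos))
    print("cov(T478K, P681H):", presence_covariance(genomes, "S:T478K", "S:P681H"))
    print("cov(T478K, E484K):", presence_covariance(genomes, "S:T478K", "S:E484K"))
```

A positive covariance indicates two mutations that tend to appear in the same genomes, and a negative one indicates mutations that tend to exclude each other. With NumPy or pandas, as cited in the manuscript, the same quantity could be computed for all mutation pairs at once from a presence/absence matrix.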
