Phylogenomics and population genomics of SARS-CoV-2 in Mexico during the pre-vaccination stage reveals variants of interest B.1.1.28.4 and B.1.1.222 or B.1.1.519 and the nucleocapsid mutation S194L associated with symptoms

Abstract

Understanding the evolution of the SARS-CoV-2 virus in various regions of the world during the Covid-19 pandemic is essential to help mitigate the effects of this devastating disease. We describe the phylogenomic and population genetic patterns of the virus in Mexico during the pre-vaccination stage, including asymptomatic carriers. A real-time quantitative PCR screening and phylogenomic reconstructions directed at sequence/structure analysis of the spike glycoprotein revealed the mutation of concern E484K in genomes from central Mexico, in addition to the nationwide prevalence of the imported variant 20C/S:452R (B.1.427/9). Overall, the detected variants in Mexico show spike protein mutations in the N-terminal domain (i.e. R190M), in the receptor-binding motif (i.e. T478K, E484K), within the S1–S2 subdomains (i.e. P681R/H, T732A), and at the base of the protein (V1176F), raising concerns about the lack of phenotypic and clinical data available for the variants of interest we postulate: 20B/478K.V1 (B.1.1.222 or B.1.1.519) and 20B/P.4 (B.1.1.28.4). Moreover, the population patterns of single nucleotide variants from symptomatic and asymptomatic carriers obtained with a self-sampling scheme confirmed the presence of several fixed variants as well as differences in allelic frequencies among localities. We identified the mutation N:S194L of the nucleocapsid protein associated with symptomatic patients. Phylogenetically, this mutation is frequent in Mexican sub-clades. Our results highlight the dual and complementary role of spike and nucleocapsid proteins in adaptive evolution of SARS-CoV-2 to its hosts and provide a baseline for specific follow-up of mutations of concern during the vaccination stage.

Article activity feed

  1. SciScore for 10.1101/2021.05.18.21256128:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics
        IRB: Ethical committee clearance: Ethical committee clearance.
        Consent: An informed written consent for the use of surveillance samples was obtained from all patients.
    Sex as a biological variable: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Detection of San Luis Potosí SARS-CoV2 positive samples was done with the GeneFinder COVID-19 PLUS RealAmp Kit.
    GeneFinder
    suggested: (GENEFINDER, RRID:SCR_009190)
    We identified 337 combinations of mutations and 315 unique mutations from 1552 (two sequences were filtered out because of quality issues) sequences using in-house Perl and Python scripts.
    Python
    suggested: (IPython, RRID:SCR_001658)
    We transformed the output file to study the incidence for 315 mutations, we grouped them in 11 clades, and we studied their covariances between one another, applying in-house scripts with R packages: tidyverse (Wickham et al. 2019), circlize (Gu et al. 2014), and Python modules: NumPy (Harris et al. 2020), Pandas (McKinney et al. 2010), matplotlib (Hunter, 2007), seaborn (Waskom et al. 2017)
    NumPy
    suggested: (NumPy, RRID:SCR_008633)
    matplotlib
    suggested: (MatPlotLib, RRID:SCR_008624)
    They were then mapped to the NC_045512.2 version of the SARS-CoV-2 reference genome using BWA (Li & Durbin 2009) with default parameters.
    BWA
    suggested: (BWA, RRID:SCR_010910)
    Sam alignments were then converted to bam files and sorted using samtools (Li et al. 2009).
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    (Schrodinger LLC, http://www.pymol.org).
    http://www.pymol.org
    suggested: (PyMOL, RRID:SCR_000305)
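
The sentences quoted in Table 2 describe tallying unique mutations and mutation combinations with in-house scripts, then studying covariances among mutations. Those scripts are not reproduced on this page. The following is only a minimal Python sketch of that kind of tally, assuming each genome has already been reduced to a list of mutation labels. The function names (`summarize_mutations`, `presence_covariance`), the input format, and the toy genomes are hypothetical and are not taken from the study.

```python
from collections import Counter


def summarize_mutations(genome_mutations):
    """Count unique mutations and distinct mutation combinations.

    genome_mutations: dict mapping a sequence ID to an iterable of
    mutation labels (hypothetical input format, assumed for this sketch).
    """
    mutation_counts = Counter()     # number of genomes carrying each mutation
    combination_counts = Counter()  # number of genomes sharing each exact mutation set
    for mutations in genome_mutations.values():
        combo = frozenset(mutations)
        combination_counts[combo] += 1
        mutation_counts.update(combo)
    return mutation_counts, combination_counts


def presence_covariance(genome_mutations, mut_a, mut_b):
    """Sample covariance of the 0/1 presence vectors of two mutations."""
    xs = [int(mut_a in m) for m in genome_mutations.values()]
    ys = [int(mut_b in m) for m in genome_mutations.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two genomes")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Toy genomes invented purely for illustration; NOT data from the study.
toy_genomes = {
    "genome_1": ["S:T478K", "N:S194L"],
    "genome_2": ["S:T478K", "N:S194L"],
    "genome_3": ["S:T478K", "S:E484K"],
    "genome_4": ["S:T478K"],
}

mutation_counts, combination_counts = summarize_mutations(toy_genomes)
print(len(mutation_counts), "unique mutations")
print(len(combination_counts), "mutation combinations")
print(presence_covariance(toy_genomes, "N:S194L", "S:E484K"))
```

Using a `frozenset` per genome makes the combination count independent of the order in which mutations are listed. A negative covariance between two presence vectors indicates that the two mutations tend not to occur in the same genomes within the sample.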

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.