Genomic diversity and hotspot mutations in 30,983 SARS-CoV-2 genomes: moving toward a universal vaccine for the “confined virus”?
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The COVID-19 pandemic has been ongoing since its onset in late November 2019 in Wuhan, China. Understanding and monitoring the genetic evolution of the virus, its geographical characteristics, and its stability are particularly important for controlling the spread of the disease and especially for developing a universal vaccine covering all circulating strains. From this perspective, we analyzed 30,983 complete SARS-CoV-2 genomes from 79 countries across six continents, collected from December 24, 2019, to May 13, 2020, as recorded in the GISAID database. Our analysis revealed 3,206 variant sites, with a uniform distribution of mutation types across geographic areas. Remarkably, recurrent mutations were infrequent: only 169 mutations (5.27%) had a prevalence greater than 1% of genomes. Nevertheless, fourteen non-synonymous hotspot mutations (> 10%) were identified at different locations along the viral genome: eight in the ORF1ab polyprotein (in nsp2, nsp3, transmembrane domain, RdRp, helicase, exonuclease, and endoribonuclease), three in the nucleocapsid protein, and one in each of three proteins: spike, ORF3a, and ORF8. Moreover, 36 non-synonymous mutations were identified in the RBD of the spike protein with a low prevalence (<1%) across all genomes, of which only four could potentially enhance the binding of the SARS-CoV-2 spike protein to the human ACE2 receptor. These results, along with the mutational frequency dissimilarity and intra-genomic divergence of SARS-CoV-2, could indicate that SARS-CoV-2 is not yet adapted to its host. Unlike the influenza virus or HIV, SARS-CoV-2 has a low mutation rate, which makes the development of an effective global vaccine very likely.
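The prevalence figures in the abstract can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the per-mutation genome counts passed to `classify` are made up for demonstration, and the labels "rare", "recurrent", and "hotspot" are shorthand for the abstract's thresholds (prevalence above 1% and above 10% of genomes). It is not the authors' code.

```python
# Totals taken from the abstract.
TOTAL_GENOMES = 30_983
VARIANT_SITES = 3_206
RECURRENT_MUTATIONS = 169  # mutations with prevalence > 1% of genomes


def prevalence(genomes_with_mutation: int, total_genomes: int = TOTAL_GENOMES) -> float:
    """Fraction of genomes carrying a given mutation."""
    return genomes_with_mutation / total_genomes


def classify(prev: float, recurrent_cutoff: float = 0.01, hotspot_cutoff: float = 0.10) -> str:
    """Label a mutation using the abstract's two prevalence thresholds."""
    if prev > hotspot_cutoff:
        return "hotspot"
    if prev > recurrent_cutoff:
        return "recurrent"
    return "rare"


# Share of variant sites that are recurrent: 169 / 3,206.
recurrent_share = RECURRENT_MUTATIONS / VARIANT_SITES
print(f"{recurrent_share:.2%}")  # 5.27%

# Hypothetical genome counts, for illustration only.
for count in (20_000, 500, 100):
    print(count, classify(prevalence(count)))
```

With the abstract's totals, the recurrent share reproduces the reported 5.27%; the three hypothetical counts fall into the hotspot, recurrent, and rare bins respectively.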
Article activity feed
SciScore for 10.1101/2020.06.20.163188:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentences, each followed by its detected Resources:
- Variant calling analysis: Genome sequences were mapped to the reference sequence Wuhan-Hu-1/2019 (Genbank ID: NC_045512.2) using Minimap v2.12-r847 [34].
  Minimap: suggested: None
- The final sorted BAM files were used to call the genetic variants in variant call format (VCF) by SAMtools mpileup and BCFtools [35].
  SAMtools: suggested: (SAMTOOLS, RRID:SCR_002105)
- For that, the SnpEff databases were first built locally using annotations of the reference sequence Wuhan-Hu-1/2019 obtained in the GFF format from NCBI database.
  SnpEff: suggested: (SnpEff, RRID:SCR_005191)
  NCBI: suggested: (NCBI, RRID:SCR_006472)
- Comparative analysis of D614 (wild type) and G614 (mutant) interactions with their surrounding residues was done in PyMOL 2.3 (Schrodinger L.L.C).
  PyMOL: suggested: (PyMOL, RRID:SCR_000305)
- The tree was constructed in IQ-TREE v1.5.5 [41] using the maximum likelihood method under the GTR model.
  IQ-TREE: suggested: (IQ-TREE, RRID:SCR_017254)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.