Evolutionary and structural analyses of SARS-CoV-2 D614G spike protein mutation now documented worldwide

Abstract

The COVID-19 pandemic, caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), was declared by the World Health Organization on March 11, 2020. As of May 31, 2020, more than 6 million COVID-19 cases have been diagnosed worldwide, with over 370,000 deaths, according to Johns Hopkins. Thousands of SARS-CoV-2 strains have been sequenced to date, providing a valuable opportunity to investigate the evolution of the virus on a global scale. We performed a phylogenetic analysis of over 1,225 SARS-CoV-2 genomes spanning from late December 2019 to mid-March 2020. We identified a missense mutation, D614G, in the SARS-CoV-2 spike protein; the clade carrying it has become predominant in Europe (954 of 1,449 sequences, 66%) and is spreading worldwide (1,237 of 2,795 sequences, 44%). Molecular dating analysis placed the emergence of this clade in mid-to-late January 2020 (10–25 January). We also applied structural bioinformatics to assess the potential impact of D614G on the virulence and epidemiology of SARS-CoV-2. In silico analyses of the spike protein structure suggest that the mutation is most likely neutral to protein function with respect to its interaction with the human ACE2 receptor. The lack of available clinical metadata prevented us from investigating the association between viral clade and disease severity. Future work that combines clinical outcome data with both viral and human genomic diversity is needed to monitor the pandemic.
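The clade frequencies quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are taken from the abstract, while the `counts` dictionary and the `percent` helper are hypothetical names introduced here, not part of the authors' analysis.

```python
# Counts quoted in the abstract: (sequences in the D614G clade, total sequences).
counts = {
    "Europe": (954, 1449),
    "Worldwide": (1237, 2795),
}


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


for region, (clade, total) in counts.items():
    # Print one decimal place so the rounding to whole percentages is visible.
    print(f"{region}: {clade}/{total} = {percent(clade, total):.1f}%")
# Europe: 954/1449 = 65.8%
# Worldwide: 1237/2795 = 44.3%
```

Rounded to whole numbers, these values match the 66% and 44% reported in the abstract.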

Article activity feed

  1. SciScore for 10.1101/2020.06.08.140459:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms

    Sentence: Spearman’s rank correlation analysis was performed using GraphPad Prism (v 8.3.1).
    Resource: GraphPad; suggested: (GraphPad Prism, RRID:SCR_002798)

    Sentence: The resulting 2,017 curated sequences were aligned using mafft [30] (v7.450: --retree 2 --maxiterate 2 --auto settings), and unaligned ends and gap positions from the resulting alignment were removed using Geneious (version R10.2.6, https://www.geneious.com).
    Resource: Geneious; suggested: (Geneious, RRID:SCR_010519)

    Sentence: Identified positions were masked using a custom-written python script, and further filtered for sequences with 100% nucleotide identity, resulting in an alignment of 1,225 sequences.
    Resource: python; suggested: (IPython, RRID:SCR_001658)

    Sentence: Molecular dating analysis was performed by the Markov chain Monte Carlo (MCMC) method implemented by Bayesian Evolutionary Analysis on Sampling Trees (BEAST) 1.10.4 [35].
    Resource: BEAST; suggested: (BEAST, RRID:SCR_010228)

    Sentence: The resulting coalescent tree was generated using TreeAnnotator [37] and visualized using ggtree package [38] in R version 3.6.1 (https://www.r-project.org/).
    Resource: https://www.r-project.org/; suggested: (R Project for Statistical Computing, RRID:SCR_001905)
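The curation sentence quoted in Table 2 (masking identified positions, then filtering sequences with 100% nucleotide identity) can be illustrated with a minimal sketch. This is not the authors' custom script, which is not shown here. The function names, the use of `N` as the mask character, 0-based column indices, and the reading of "100% identity" as exact string equality after masking are all assumptions made for illustration.

```python
from typing import Dict, Iterable


def mask_positions(seq: str, positions: Iterable[int], mask_char: str = "N") -> str:
    """Replace the given 0-based alignment columns with mask_char (assumed 'N')."""
    chars = list(seq)
    for pos in positions:
        chars[pos] = mask_char
    return "".join(chars)


def drop_identical(alignment: Dict[str, str]) -> Dict[str, str]:
    """Keep only the first record from each group of identical sequences."""
    seen = set()
    unique = {}
    for name, seq in alignment.items():
        key = seq.upper()  # compare case-insensitively
        if key not in seen:
            seen.add(key)
            unique[name] = seq
    return unique


# Toy alignment (hypothetical data, not from the study).
toy = {
    "seqA": "ACGTACGT",
    "seqB": "ACGTTCGT",  # differs from seqA only at column 4
    "seqC": "ACGAACGT",
}

# Mask column 4 in every sequence, then remove exact duplicates.
masked = {name: mask_positions(seq, [4]) for name, seq in toy.items()}
unique = drop_identical(masked)
print(sorted(unique))  # ['seqA', 'seqC']
```

In this toy case, seqB becomes identical to seqA once column 4 is masked, so only seqA and seqC remain. The order of operations (mask first, then filter) follows the quoted sentence.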

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.