Two mutations in the SARS-CoV-2 spike protein and RNA polymerase complex are associated with COVID-19 mortality risk

Abstract

Background

SARS-CoV-2 mortality has been extensively studied in relation to host susceptibility, but how sequence variations in the SARS-CoV-2 genome affect pathogenicity is poorly understood. Associating whole-genome sequencing (WGS) data of the virus with death in SARS-CoV-2 patients is one potential method for early identification of highly pathogenic strains to target for containment.

Methods

We analyzed 7,548 single-stranded RNA genomes of SARS-CoV-2 from patients in the GISAID database (Elbe and Buckland-Merrett, 2017; Shu and McCauley, 2017) and associated variants with the patients' reported COVID-19 health status, i.e., deceased versus non-deceased. We probed each locus of the single-stranded SARS-CoV-2 RNA genome for direct association with host/patient mortality using logistic regression.
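
A per-locus scan of this kind can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a binary encoding (1 if a sample's base differs from the reference at that locus, 0 otherwise), a model with only an intercept and the variant indicator, and a Wald test for the coefficient. The names `fit_logistic` and `scan_loci` are hypothetical, and ambiguity codes and alignment gaps are not treated specially.

```python
import math

def _score_info(b0, b1, x, y):
    """Gradient and Fisher information of the log-likelihood at (b0, b1)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11

def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson.

    Returns (b1, standard error of b1, two-sided Wald p-value).
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = _score_info(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        # Newton step: inverse of the 2x2 information matrix times the gradient.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Variance of b1 is the (1, 1) entry of the inverse information matrix.
    _, _, h00, h01, h11 = _score_info(b0, b1, x, y)
    se = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    p_value = math.erfc(abs(b1 / se) / math.sqrt(2.0))
    return b1, se, p_value

def scan_loci(reference, sequences, deceased):
    """Test every aligned position; returns (1-based locus, b1, se, p) tuples."""
    results = []
    for i, ref_base in enumerate(reference):
        # Assumed encoding: any character other than the reference base counts as a variant.
        x = [1 if seq[i] != ref_base else 0 for seq in sequences]
        n11 = sum(1 for xi, yi in zip(x, deceased) if xi and yi)
        n10 = sum(x) - n11
        n01 = sum(deceased) - n11
        n00 = len(x) - n11 - n10 - n01
        # Skip invariant loci and complete separation, where no finite estimate exists.
        if min(n11, n10, n01, n00) == 0:
            continue
        results.append((i + 1, *fit_logistic(x, deceased)))
    return results
```

With a single binary predictor, the fitted coefficient equals the log odds ratio of the 2x2 table of variant status against outcome, which gives a quick sanity check on any implementation.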

Results

Of the 29,891 loci of the viral genome evaluated for association with patient/host mortality, two, at 12,053bp and 25,088bp, achieved genome-wide significance (p-values of 4.09e-09 and 4.41e-23, respectively).
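
The abstract does not state which genome-wide threshold was applied. As a rough check, the sketch below assumes a Bonferroni correction at a family-wise error rate of 0.05 across all 29,891 loci; both the correction method and the 0.05 level are assumptions, not details taken from the study.

```python
N_LOCI = 29_891   # loci evaluated, as reported above
ALPHA = 0.05      # assumed family-wise error rate

def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff under a Bonferroni correction."""
    return alpha / n_tests

# p-values reported for the two loci
reported = {12053: 4.09e-09, 25088: 4.41e-23}

threshold = bonferroni_threshold(ALPHA, N_LOCI)  # about 1.67e-06
significant = {locus: p < threshold for locus, p in reported.items()}

for locus, p in reported.items():
    print(f"{locus} bp: p = {p:.2e}, below threshold: {significant[locus]}")
```

Under these assumptions both reported p-values fall well below the cutoff.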

Conclusions

Mutations at 25,088bp occur in the S2 subunit of the SARS-CoV-2 spike protein, which plays a key role in viral entry into target host cells. Additionally, mutations at 12,053bp are within the ORF1ab gene, in a region encoding the protein nsp7, which is necessary to form the RNA polymerase complex responsible for viral replication and transcription. Both mutations altered amino acid coding sequences, potentially imposing structural changes that could enhance viral infectivity and symptom severity, and may be important to consider as targets for therapeutic development. Identification of these highly significant associations, which are unlikely to occur by chance, may assist with early containment of COVID-19 strains that are potentially highly pathogenic.
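
How a genome position maps to a gene and codon can be sketched as below. The feature coordinates are assumptions taken from the NCBI RefSeq annotation NC_045512.2, not from the text above, and the sketch further assumes that the alignment coordinates used in the study coincide with that annotation. The S1/S2 boundary at residue 685/686 is likewise an assumption, and `locate` is a hypothetical helper.

```python
# Assumed feature coordinates (1-based, inclusive) from the NC_045512.2
# annotation; they are not stated in the abstract.
FEATURES = {
    "nsp7 (ORF1ab)": (11843, 12091),
    "S (spike)": (21563, 25384),
}
S2_FIRST_CODON = 686  # assumed: S1/S2 cleavage after spike residue 685

def locate(pos):
    """Return (feature, codon number, position within codon) or None."""
    for name, (start, end) in FEATURES.items():
        if start <= pos <= end:
            offset = pos - start
            return name, offset // 3 + 1, offset % 3 + 1
    return None

for pos in (12053, 25088):
    name, codon, codon_pos = locate(pos)
    note = ""
    if name.startswith("S") and codon >= S2_FIRST_CODON:
        note = " (S2 subunit)"
    print(f"{pos} bp -> {name}, codon {codon}, codon position {codon_pos}{note}")
```

Under these assumed coordinates, 12,053bp falls in codon 71 of nsp7 and 25,088bp in codon 1176 of spike, which lies in the S2 subunit, consistent with the locations described above.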

Article activity feed

  1. SciScore for 10.1101/2020.11.17.386714:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: Data cleaning: We filtered the 8,647 samples for complete nucleotide sequences, and aligned them to the SARS-CoV-2 reference sequence (published on GISAID under the accession number EPI_ISL_402124) using MAFFT (Katoh et al., 2002).
    Resource: MAFFT
    Suggested: (MAFFT, RRID:SCR_011811)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.