Identification of a High-frequency Intra-host SARS-CoV-2 spike Variant with Enhanced Cytopathic and Fusogenic Effect

Abstract

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a virus that is continuously evolving. Although its RNA-dependent RNA polymerase exhibits some exonuclease proofreading activity, viral sequence diversity can be produced by replication errors and host factors. A diversity of genetic variants can be observed in the intra-host viral population structure of infected individuals. Most mutations will follow neutral molecular evolution and will not contribute significantly to variation within and between infected hosts. Herein, we profiled the intra-sample genetic diversity of SARS-CoV-2 variants using high-throughput sequencing datasets from 15,289 infected individuals and infected cell lines. Most of the genetic variations observed, including C->U and G->U, were consistent with errors due to heat-induced DNA damage during sample processing and/or sequencing protocols. Despite this high mutational background, we identified recurrent intra-variable positions in the samples analyzed, including several positions at the end of the gene encoding the viral Spike (S) protein. Strikingly, we observed a high-frequency C->A missense mutation resulting in the S protein lacking the last 20 amino acids (SΔ20). We found that this truncated S protein undergoes increased processing and increased syncytia formation, presumably because it escapes M protein retention in intracellular compartments. Our findings suggest the emergence of a high-frequency viral sublineage that is not horizontally transmitted but is potentially involved in intra-host cytopathic effects.
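To make "intra-variable position" concrete, here is a minimal Python sketch of how per-position nucleotide counts could be turned into a consensus base and a minor-allele frequency. It is not the authors' pipeline: the function names, the toy counts, and the `min_depth` and `min_freq` cutoffs are all assumptions chosen for illustration.

```python
from collections import Counter


def site_frequencies(base_counts):
    """Return (consensus_base, minor_base, minor_frequency) for one position.

    base_counts maps a nucleotide to the number of reads supporting it.
    """
    total = sum(base_counts.values())
    if total == 0:
        return None, None, 0.0
    ranked = Counter(base_counts).most_common()
    consensus = ranked[0][0]
    # No second allele observed at this position.
    if len(ranked) < 2 or ranked[1][1] == 0:
        return consensus, None, 0.0
    minor, minor_count = ranked[1]
    return consensus, minor, minor_count / total


def intra_variable_positions(pileup, min_depth=100, min_freq=0.05):
    """List positions whose second most common base exceeds min_freq.

    min_depth and min_freq are hypothetical cutoffs, not values from the paper.
    """
    hits = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        if sum(counts.values()) < min_depth:
            continue  # too few reads to estimate a frequency
        consensus, minor, freq = site_frequencies(counts)
        if minor is not None and freq >= min_freq:
            hits.append((pos, consensus, minor, freq))
    return hits


# Toy data: position -> read counts per base (invented for illustration).
toy_pileup = {
    1: {"A": 995, "G": 5},    # deep, but the minor allele is rare
    2: {"C": 600, "A": 400},  # deep and clearly mixed
    3: {"T": 20, "C": 10},    # mixed, but too shallow to trust
}

if __name__ == "__main__":
    for hit in intra_variable_positions(toy_pileup):
        print(hit)
```

On the toy data only position 2 is reported. The depth filter matters because, as the abstract notes, much low-frequency variation is consistent with processing or sequencing errors.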

IMPORTANCE

The mutation rate and evolution of RNA viruses correlate with viral adaptation. While most mutations do not have significant contributions to viral molecular evolution, some are naturally selected and cause a genetic drift through positive selection. Many SARS-CoV-2 variants have recently been described and show phenotypic selection towards more infectious viruses. Our study describes another type of variant, one that does not contribute to inter-host heterogeneity but rather reflects phenotypic selection toward variants that might have increased cytopathic effects. We identified that a C-terminal truncation of the Spike protein removes an important ER-retention signal, which consequently results in a Spike variant that easily travels through the Golgi toward the plasma membrane in a pre-activated conformation, leading to increased syncytia formation.

Article activity feed

  1. SciScore for 10.1101/2020.12.03.409714:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.
    Cell Line Authentication: not detected.

    Table 2: Resources

    Experimental Models: Cell Lines
    Sentences | Resources
    Syncytium Formation Assay: HEK293T expressing ACE2 cells were seeded in 24-well plates in complete media to obtain an 85% confluence the following day.
    HEK293T
    suggested: None
    Software and Algorithms
    Sentences | Resources
    Analysis of intra-variability within SARS-CoV-2 samples: 15,289 publicly available high-throughput sequencing datasets were downloaded from the NCBI Sequence Read Archive (up to July 10, 2020).
    NCBI Sequence Read Archive
    suggested: (NCBI Sequence Read Archive (SRA), RRID:SCR_004891)
    Duplicated reads were combined to reduce amplification bias and mapped to the SARS-CoV-2 isolate Wuhan-Hu-1 reference genome (NC_045512v2) using hisat2 (v.2.1.0)[47].
    hisat2
    suggested: (HISAT2, RRID:SCR_015530)
    For each dataset, the consensus sequences and the frequency of nucleotides at each position were extracted from files generated by bcftools (v.1.10.2) of the samtools package (v.1.1) with an in-house Perl script [48,49].
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Transcript abundance was performed using HTSeq 0.12.4 [51] and normalized into Transcripts Per Million (TPM) in R.
    HTSeq
    suggested: (HTSeq, RRID:SCR_005514)
    GFP area was quantified on ImageJ [54].
    ImageJ
    suggested: (ImageJ, RRID:SCR_003070)
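    The Transcripts Per Million normalization mentioned in the HTSeq entry above can be sketched as follows. The manuscript states that this step was done in R; the Python version below is only an illustration of the standard TPM formula, and the function name, gene names, counts, and lengths are hypothetical.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to Transcripts Per Million (TPM).

    counts:     gene -> number of reads assigned to the gene
    lengths_bp: gene -> gene (or transcript) length in base pairs
    """
    # Step 1: reads per kilobase, which corrects for gene length.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    total_rpk = sum(rpk.values())
    if total_rpk == 0:
        return {gene: 0.0 for gene in counts}
    # Step 2: rescale so that the values in one sample sum to one million.
    return {gene: value * 1e6 / total_rpk for gene, value in rpk.items()}


# Toy input (invented): geneB has three times the reads and three times the length.
toy_counts = {"geneA": 100, "geneB": 300}
toy_lengths = {"geneA": 1000, "geneB": 3000}

if __name__ == "__main__":
    print(counts_to_tpm(toy_counts, toy_lengths))
```

    Because length correction is applied before scaling, the two toy genes receive the same TPM even though their raw counts differ threefold.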

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • No conflict of interest statement was detected. If there are no conflicts, we encourage authors to explicitly state so.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.