The unique evolutionary dynamics of the SARS-CoV-2 Delta variant



Abstract

The pandemic driven by SARS-Coronavirus-2 (SARS-CoV-2) was first recognized in late 2019, and the first few months of its evolution were relatively clock-like, dominated mostly by neutral substitutions. In contrast, the second year of the pandemic was punctuated by the emergence of several variants that bore evidence of dramatic evolution. Here, we compare and contrast the evolutionary patterns of these variants, with a focus on the recent Delta variant. Most variants are characterized by long branches leading to their emergence, with an excess of non-synonymous substitutions occurring particularly in the Spike and Nucleocapsid proteins. The Delta variant, which is now becoming globally dominant, instead lacks the signature long branch and is characterized by an ongoing step-wise evolutionary process. Unlike the “star-like” topologies of other variants, Delta shows the formation of several distinct clades, which we denote clades A-E. We find that sequences from the Delta D clade are dramatically increasing in frequency across different regions of the globe. Delta D is characterized by an excess of non-synonymous mutations, mostly occurring in ORF1a/b, some of which occurred in parallel in other notable variants. We conclude that the current Delta surge is composed almost exclusively of Delta D, and discuss whether selection or random genetic drift has driven the emergence of Delta D.

Article activity feed

  1. SciScore for 10.1101/2021.08.05.21261642:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics
      IRB: Ethics statement: The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of the Sheba Medical Center (7045-20-SMC).
      Consent: For the Israel cohort, patient consent was waived as the study used remains of clinical samples and the analysis used anonymous clinical data.
    Sex as a biological variable: not detected.
    Randomization: Of these, twelve lineages were randomly chosen with an emphasis on lineages prevalent across different continents.
    Blinding: not detected.
    Power Analysis: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Construction of phylogenetic trees: Phylogenetic trees were constructed using NextStrain’s Augur pipeline [8].
    Augur
    suggested: None
    Dating of internal nodes is reported based on NextStrain, which in turn relies on IQ-Tree [25] for ancestral sequence reconstruction, dating, and assignment of confidence intervals for these dates.
    IQ-Tree
    suggested: (IQ-TREE, RRID:SCR_017254)
    Fastq files were subjected to quality control using FastQC (www.bioinformatics.babraham.ac.uk/projects/fastqc/) and MultiQC [33], and low-quality sequences were filtered using trimmomatic [34].
    FastQC
    suggested: (FastQC, RRID:SCR_014583)
    MultiQC
    suggested: (MultiQC, RRID:SCR_014982)
    trimmomatic
    suggested: (Trimmomatic, RRID:SCR_011848)
    Sequences were mapped to the SARS-CoV-2 reference genome (NC_045512.2) with the Burrows-Wheeler aligner (BWA) mem [35].
    BWA
    suggested: (BWA, RRID:SCR_010910)
    The resulting BAM files were sorted and indexed using the SAMtools suite [36].
    SAMtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Multiple alignment of sample sequences with the reference Wuhan sequence (NC_045512.2) was performed with MAFFT using default parameters [24].
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
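
The read-processing steps quoted in Table 2 can be pictured as a chain of command-line calls. The sketch below only assembles and prints such commands; it does not run them and is not the authors' actual pipeline. Paired-end input, all file names, the `trimmomatic` wrapper name, and the trimming thresholds (`SLIDINGWINDOW:4:20`, `MINLEN:36`) are assumptions made for illustration, since the quoted sentences do not state them. The final FASTA passed to MAFFT is a hypothetical file holding the sample sequences together with the reference.

```python
import shlex
from pathlib import Path


def build_pipeline_commands(sample, r1, r2, reference, sequences_fasta, outdir="results"):
    """Return (argv, stdout_path) pairs for each step.

    stdout_path is None unless the tool writes its result to standard output.
    All paths and trimming thresholds are illustrative assumptions.
    """
    out = Path(outdir)
    qc_dir = str(out / "qc")
    paired_1 = str(out / f"{sample}_R1.paired.fastq.gz")
    unpaired_1 = str(out / f"{sample}_R1.unpaired.fastq.gz")
    paired_2 = str(out / f"{sample}_R2.paired.fastq.gz")
    unpaired_2 = str(out / f"{sample}_R2.unpaired.fastq.gz")
    sam = str(out / f"{sample}.sam")
    bam = str(out / f"{sample}.sorted.bam")
    aligned = str(out / "aligned.fasta")

    return [
        # Quality control reports for the raw reads, then an aggregated report.
        (["fastqc", "--outdir", qc_dir, r1, r2], None),
        (["multiqc", "--outdir", qc_dir, qc_dir], None),
        # Filter low-quality sequence (thresholds are assumed, not from the source).
        (["trimmomatic", "PE", r1, r2, paired_1, unpaired_1, paired_2, unpaired_2,
          "SLIDINGWINDOW:4:20", "MINLEN:36"], None),
        # Index the reference, then map the trimmed reads with BWA-MEM.
        (["bwa", "index", reference], None),
        (["bwa", "mem", "-o", sam, reference, paired_1, paired_2], None),
        # Sort and index the alignments.
        (["samtools", "sort", "-o", bam, sam], None),
        (["samtools", "index", bam], None),
        # Multiple alignment with default parameters; MAFFT writes to stdout.
        (["mafft", sequences_fasta], aligned),
    ]


if __name__ == "__main__":
    steps = build_pipeline_commands(
        "sample1", "sample1_R1.fastq.gz", "sample1_R2.fastq.gz",
        "NC_045512.2.fasta", "samples_plus_reference.fasta",
    )
    for argv, stdout_path in steps:
        line = shlex.join(argv)
        if stdout_path is not None:
            line += " > " + shlex.quote(stdout_path)
        print(line)
```

Keeping each step as an argument list rather than a shell string makes it easy to pass the same list to `subprocess.run` later, should one choose to execute the steps.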

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    The caveat in this hypothesis is that infections from all Delta clades are evident already in April across the globe. It is possible that these are a biased sample from incoming travelers who did not go on to create transmission chains. A second hypothesis is that Delta D may be under positive selection. In line with this, 82% of the lineage-specific mutations of this clade are non-synonymous, similar to what characterizes VOCs (Table S2, Fig. 2A). However, this clade lacks additional substitutions in its Spike gene (as opposed to clades A and E), and is characterized by seven amino-acid replacements in the ORF1a/b polyprotein. This is particularly perplexing as the lineage-defining mutations of the main Delta lineage are depleted of mutations in ORF1a/b (Table S2). An additional non-synonymous substitution is evident in the ORF7b gene (T40I) and in the N gene (G215C), quite proximal to the 203-205 region discussed above. Only two of the eleven substitutions unique to the Delta D clade are seen in other clades worldwide (Table S2), and thus the functional implications of Delta D substitutions remain to be further investigated. To summarize, we have used a comparative approach to detect a unique mode of evolution present in Delta. This step-wise mode of evolution characterized both the formation of the Delta D clade, and its subsequent spread, and is in stark contrast to the evolution observed in other VOCs. In particular, the global increase in Delta frequency ...
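
The share of non-synonymous mutations mentioned in the passage above rests on classifying each substitution by whether it changes the encoded amino acid. A minimal sketch of that classification follows, assuming single-codon substitutions and the standard genetic code. The function names and the example codon pairs are hypothetical toy values and are not taken from the manuscript's Table S2.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Label a codon change as 'synonymous' or 'non-synonymous'."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


def fraction_nonsynonymous(codon_pairs):
    """Fraction of (ref_codon, alt_codon) pairs that change the amino acid."""
    labels = [classify_substitution(ref, alt) for ref, alt in codon_pairs]
    return labels.count("non-synonymous") / len(labels)


if __name__ == "__main__":
    # Toy codon pairs chosen for illustration only.
    toy_pairs = [("GGA", "GGT"), ("GGT", "TGT"), ("ACT", "ATT"), ("CTG", "TTG")]
    for ref, alt in toy_pairs:
        print(ref, "->", alt, classify_substitution(ref, alt))
    print("fraction non-synonymous:", fraction_nonsynonymous(toy_pairs))
```

A real analysis would also need to place each nucleotide change in its reading frame within the annotated gene, which this sketch leaves out.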

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.