Rapid expansion of SARS-CoV-2 variants of concern is a result of adaptive epistasis

Abstract

The SARS-CoV-2 pandemic recently entered an alarming new phase with the emergence of the variants of concern (VOC), and understanding their biology is paramount to predicting future ones. Current efforts mainly focus on mutations in the spike glycoprotein (S), but changes in other regions of the viral proteome are likely key. We analyzed more than 900,000 SARS-CoV-2 genomes with a computational systems biology approach, including a haplotype network and protein structural analyses, to reveal lineage-defining mutations and their critical functional attributes. Our results indicate that increased transmission is promoted by epistasis, i.e., combinations of mutations in S and other viral proteins. Mutations in the non-S proteins involve immune antagonism and replication performance, suggesting convergent evolution. Furthermore, adaptive mutations appear in geographically disparate locations, suggesting that either independent, repeated mutation events or recombination among different strains is generating VOC. We demonstrate that recombination is the stronger hypothesis and may be accelerating the emergence of VOC by bringing together cooperative mutations. This emphasizes the importance of a global response to stop the COVID-19 pandemic.

Article activity feed

  1. SciScore for 10.1101/2021.08.03.454981:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    To ensure that deletions were accounted for, full genome sequences were aligned with MAFFT (Katoh et al. 2002) to the established reference genome (accession NC_045512), uploaded into CLC Genomics Workbench, and trimmed to the start and stop codons (nsp1 start site and ORF10 stop codon).
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    The networks were exported as a table and visualized in Cytoscape (Shannon et al. 2003) with corresponding metadata.
    Cytoscape
    suggested: (Cytoscape, RRID:SCR_003032)
    Phylogenetic tree: We used the program MrBayes to generate a phylogenetic tree (Ronquist and Huelsenbeck 2003).
    MrBayes
    suggested: (MrBayes, RRID:SCR_012067)
    A consensus tree was generated using the 50% majority rule and visualized using FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/). Estimation of genome mutation load: We estimated the mutation load using two data sets.
    FigTree
    suggested: (FigTree, RRID:SCR_008515)
    Figure 3 was created using Inkscape (https://inkscape.org/) and Gimp 2.8 (https://www.gimp.org) (Anon).
    Inkscape
    suggested: (Inkscape, RRID:SCR_014479)
    https://www.gimp.org
    suggested: (GNU Image Manipulation Program, RRID:SCR_003182)
    Three independent extensive MD simulations were performed for each species using the GROMACS 2020 package (Lindahl et al. 2020) and the CHARMM36 force field for protein and glycans (Guvench et al. 2011; Huang and MacKerell 2013).
    GROMACS
    suggested: (GROMACS, RRID:SCR_014565)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on page 13. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
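
To make the RRID check described above more concrete, here is a minimal sketch of how RRIDs of the form listed in Table 2 could be pulled out of free text. It is not SciScore's actual method. The pattern and the function name `extract_rrids` are assumptions made for illustration, and the pattern covers only the software-style `SCR_` identifiers that appear in this report.

```python
import re

# Assumption: software RRIDs look like "RRID:SCR_<digits>", as in Table 2.
# Other RRID families (antibodies, cell lines, ...) are not handled here.
RRID_PATTERN = re.compile(r"RRID:\s?(SCR_\d+)")


def extract_rrids(text):
    """Return the SCR identifiers found in text, in order of appearance."""
    return RRID_PATTERN.findall(text)


# Two "suggested" entries copied from the resources table.
sample = (
    "suggested: (MAFFT, RRID:SCR_011811) "
    "suggested: (Cytoscape, RRID:SCR_003032)"
)
print(extract_rrids(sample))
```

A sketch like this only confirms that an identifier is present and well-formed. Checking that it points to the right resource would require a lookup against the RRID registry, which is outside the scope of this example.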