SARS-CoV-2 convergent evolution as a guide to explore adaptive advantage

Abstract

Much can be learned from the 1.2 million SARS-CoV-2 sequences generated during the last 15 months. Out of the overwhelming number of mutations sampled so far, only a few rose to prominence in the viral population. Many of these emerged recently and independently in multiple lineages. Such a textbook example of convergent evolution at the molecular level is not only a curiosity but also a guide to uncovering the basis of the adaptive advantage behind these events. Focusing on the extent of convergent evolution in the spike (S) protein, our report confirms that the most concerning SARS-CoV-2 lineages carry the heaviest burden of convergent S-protein mutations, suggesting their fundamental adaptive advantage. The great majority (21/25) of S-protein sites under convergent evolution cluster tightly in three functional domains: the N-terminal domain, the receptor-binding domain, and the furin cleavage site. We further show that among the S-protein receptor-binding motif mutations, ACE2 affinity-improving substitutions are favored. While the probed mutation space in the S protein covered all amino acids reachable by single nucleotide changes, substitutions requiring two nucleotide changes or epistatic mutations of multiple residues have only recently started to emerge. Unfortunately, despite their convergent emergence and physical association, most of these adaptive mutations and their combinations remain understudied. We aim to promote research on current variants that are understudied but may become important in the future.
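The notion of amino acids "reachable by single nucleotide changes" can be illustrated with a short sketch. It is not the authors' in-house code, which is not public. It assumes the standard genetic code, and the names `CODON_TABLE` and `single_nucleotide_neighbors` are hypothetical, chosen only for this illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (first base varies slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_neighbors(codon):
    """Return the amino acids reachable from `codon` by changing one base.

    Synonymous changes and stop codons are excluded, so the result
    contains only amino-acid substitutions.
    """
    codon = codon.upper().replace("U", "T")
    original = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a change at this position
            mutant = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[mutant]
            if aa != original and aa != "*":
                reachable.add(aa)
    return reachable


# Example: GAA encodes glutamate (E).
print(sorted(single_nucleotide_neighbors("GAA")))
# ['A', 'D', 'G', 'K', 'Q', 'V']
```

Any substitution outside the returned set needs at least two nucleotide changes in the same codon, which is the class of substitutions the abstract describes as having started to emerge only recently.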

Article activity feed

  1. SciScore for 10.1101/2021.05.24.445534:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    The data manipulations as well as all SNC changes generation were done by in-house written Python 3.7 code. | Python; suggested: (IPython, RRID:SCR_001658)
    The missing structures in PDB 6zge were modeled by Modeller suite implemented in Chimera (28). | Modeller; suggested: (MODELLER, RRID:SCR_008395)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.