The evolutionary making of SARS-CoV-2

Abstract

A mechanistic understanding of how SARS-CoV-2 (sarbecovirus, betacoronavirus) infects human cells is emerging, but the evolutionary trajectory that gave rise to this pathogen is poorly understood. Here we scan SARS-CoV-2 protein sequences in silico for innovations along the evolutionary lineage starting with the last common ancestor of coronaviruses. SARS-CoV-2 substantially differs from viruses outside sarbecovirus both in its set of encoded proteins and in their domain architectures, indicating divergent functional demands. Within sarbecoviruses, sub-domain level profiling using predicted linear epitopes reveals how the primary interface between host cell and virus, the spike, was gradually reshaped. The only epitope that is private to SARS-CoV-2 overlaps with the furin cleavage site, a “switch” that modulates spike’s conformational landscape in response to host-cell interaction. This cleavage site has fundamental relevance for both immune evasion and cell infection, and the apparently ongoing evolutionary fine-tuning of its use by SARS-CoV-2 should be monitored.

Article activity feed

  1. SciScore for 10.1101/2021.01.29.428808:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: Data collection: A collection of 279 virus genomes covering the currently known diversity of coronaviridae was downloaded from the RefSeq, GenBank, ViPR, and GISAID databases (Shu and McCauley 2017).
    Resource: ViPR, suggested: (vipR, RRID:SCR_010685)

    Sentence: (RefSeq Assembly accession GCF_009858895.2) as the seed.
    Resource: RefSeq, suggested: (RefSeq, RRID:SCR_003496)

    Sentence: Linear protein features were annotated in the following way: Pfam (Pfam v. 33.1) (El-Gebali, et al. 2019) and SMART (Letunic, et al. 2009) domains were annotated with hmmsearch from the HMMER package (Finn, et al. 2015), low complexity regions were predicted with flps (Harrison 2017) and SEG (Wootton and Federhen 1993), signal peptides with SignalP (Petersen, et al. 2011), transmembrane domains with tmhmm (Sonnhammer, et al. 1998), and coiled-coil conformations with COILS2 (Lupas, et al. 1991).
    Resource: Pfam, suggested: (Pfam, RRID:SCR_004726)
    Resource: SMART, suggested: (SMART, RRID:SCR_005026)
    Resource: HMMER, suggested: (Hmmer, RRID:SCR_005305)
    Resource: SignalP, suggested: (SignalP, RRID:SCR_015644)

    Sentence: ML trees for the individual alignments were computed with RAxML (Stamatakis 2014), allowing the software to automatically select the best fitting substitution model (Option PROTGAMMAAUTO).
    Resource: RAxML, suggested: (RAxML, RRID:SCR_006086)
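
The domain annotation and tree-building steps quoted in the table can be illustrated as command lines. What follows is a minimal sketch, not the authors' pipeline: it only assembles argument lists and does not run anything. The file names, run name, random seed, and the `raxmlHPC` binary name are hypothetical placeholders, and writing a per-domain table with `--domtblout` is a choice made here for illustration. Only the use of `hmmsearch` and the `PROTGAMMAAUTO` model option is taken from the quoted sentences.

```python
import shlex


def hmmsearch_command(hmm_db, proteins, domtbl_out):
    """Build an hmmsearch call that scans proteins against a profile HMM database.

    All three paths are hypothetical placeholders; --domtblout asks hmmsearch
    to write a tabular per-domain hit list to the given file.
    """
    return ["hmmsearch", "--domtblout", str(domtbl_out), str(hmm_db), str(proteins)]


def raxml_command(alignment, run_name, seed=12345, binary="raxmlHPC"):
    """Build a RAxML maximum-likelihood search on a protein alignment.

    -m PROTGAMMAAUTO lets RAxML pick the best-fitting protein substitution
    model, as stated in the quoted sentence. The seed and binary name are
    assumptions; installed binaries may carry a different suffix.
    """
    return [binary, "-s", str(alignment), "-n", run_name,
            "-m", "PROTGAMMAAUTO", "-p", str(seed)]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch runs
    # without HMMER or RAxML installed.
    print(shlex.join(hmmsearch_command("Pfam-A.hmm", "proteins.faa", "hits.domtbl")))
    print(shlex.join(raxml_command("spike_alignment.phy", "spike")))
```

To execute the commands, the returned lists could be passed to `subprocess.run(cmd, check=True)` once both tools are installed and the placeholder paths point at real files.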

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our approach has one natural limitation that is shared with experimental epitope profiling studies (Ng, et al. 2020; Shrock, et al. 2020). Structural epitopes, which emerge in the folded protein from the spatial proximity of peptides that are non-adjacent in the amino-acid sequence, are not considered. It is for this reason that we miss a shared epitope between SARS-CoV-1 and SARS-CoV-2 that is recognized by the same antibody (Pinto, et al. 2020). Although the restriction to linear epitopes comes at the cost of a reduced sensitivity, it renders our approach scalable and independent of the availability of accurate 3D-models of the proteins under study. It thus can be rapidly applied to any novel variant of SARS-CoV-2, as well as to any newly emerging human pathogen.

    Epitope function balances immunodominance: The interaction of the virus with its host requires that the pathogen exposes parts of the spike epitopes, most prominently the RBD and the RBM therein, to establish contact with the ACE2 (Lan, et al. 2020; Walls, et al. 2020). The resulting epitopes should be highly immunogenic, and their evolutionary emergence and subsequent fate are determined by a trade-off between their function and the counter-selective pressure imposed by the host immune system. For LE9, representing the furin cleavage site, and LE6, which harbors four contact residues to ACE2, the balance between risk and payoff of the epitope is clearly in favor of the latter. Both are critical for the infection process (Hoffma...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.