SARS‐CoV‐2, an evolutionary perspective of interaction with human ACE2 reveals undiscovered amino acids necessary for complex stability

Abstract

The emergence of SARS‐CoV‐2 has resulted in nearly 1,280,000 infections and 73,000 deaths globally so far. This novel virus acquired the ability to infect human cells using the SARS‐CoV cell receptor hACE2. Because of this, it is essential to improve our understanding of the evolutionary dynamics surrounding the interaction between SARS‐CoV‐2 and hACE2. Theory predicts that one way selection pressures shape viral evolution is by enhancing binding with host cells. We first assessed evolutionary dynamics in select betacoronavirus spike protein genes to predict whether these genomic regions are under directional or purifying selection between divergent viral lineages, at various scales of relatedness. With this analysis, we identify a region inside the receptor‐binding domain with putative sites under positive selection interspersed among highly conserved sites, which are implicated in the structural stability of the viral spike protein and its binding to the human receptor ACE2. Next, to gain further insights into factors associated with recognition of the human host receptor, we performed modeling studies of five different betacoronaviruses and their potential binding to hACE2. Modeling results indicate that interfering with the salt bridges at hot spot 353 could be an effective strategy for inhibiting binding, and hence for the prevention of SARS‐CoV‐2 infections. We also propose that a glycine residue at the receptor‐binding domain of the spike glycoprotein may have a critical role in permitting bat SARS‐related coronaviruses to infect human cells.

Article activity feed

  1. SciScore for 10.1101/2020.03.21.001933:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Phylogenetic reconstruction, and analysis testing for evidence of positive/purifying selection in the coronavirus S-protein region: The most similar genomes to SARS-CoV-2 MN908947 were retrieved using BLASTp (Altschul et al., 1997) vs the NR database of GenBank (Table 1).
    BLASTp
    suggested: (BLASTP, RRID:SCR_001010)
    Genomes were then aligned using MAUVE (Darling et al., 2004) and the S-protein gene was trimmed.
    MAUVE
    suggested: (Mauve, RRID:SCR_012852)
    The phylogenetic reconstruction of S-protein genes was performed with PhyML (Guindon et al., 2010), using a GTR+I+G model, using 100 non-parametric bootstrap replicates.
    PhyML
    suggested: (PhyML, RRID:SCR_014629)
    We used the set of programs available in HyPhy (Kosakovsky Pond et al., 2020), Fast Unconstrained Bayesian
    HyPhy
    suggested: (HyPhy, RRID:SCR_016162)
    Homology models were built with Modeller v.
    Modeller
    suggested: (MODELLER, RRID:SCR_008395)
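
The PhyML settings quoted in the table above (GTR+I+G model, 100 non-parametric bootstrap replicates) can be written out as a command line. The sketch below only assembles the argument list and does not run anything. The manuscript excerpt does not give the actual command, so the flag mapping reflects the PhyML 3 command-line options as an assumption, and the executable name `phyml`, the helper `phyml_command`, and the file name `spike_genes.phy` are hypothetical.

```python
from typing import List


def phyml_command(alignment_path: str, bootstraps: int = 100) -> List[str]:
    """Assemble a PhyML argument list for a GTR+I+G nucleotide analysis.

    Assumption: flags follow the PhyML 3 command-line interface; the
    original command used by the authors is not given in the source.
    """
    return [
        "phyml",                # executable name may differ per installation
        "-i", alignment_path,   # PHYLIP-format alignment (hypothetical path)
        "-d", "nt",             # nucleotide data
        "-m", "GTR",            # GTR substitution model
        "-v", "e",              # +I: estimate the proportion of invariable sites
        "-a", "e",              # +G: estimate the gamma shape parameter
        "-b", str(bootstraps),  # non-parametric bootstrap replicates
    ]


cmd = phyml_command("spike_genes.phy")
command_line = " ".join(cmd)
```

The list form can be passed to `subprocess.run` on a machine where PhyML is installed, and `command_line` gives the equivalent shell string for a methods note.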

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    This result does not disregard the presence of positive selection sites in SARS-CoV-2; nonetheless, it shows the limitation of the methods to identify with precision specific sites under positive selection in a precise taxon of a phylogenetic tree. We further warn that researchers need to be conservative with interpretations of studies utilizing these methodologies, given that equivocal results can be generated by datasets varying in genetic similarity. To complement our analyses looking for evidence of selection among lineages, we specifically analyzed patterns of selection across sites in the S-protein genes, using the site models available in CODEML and HyPhy. Model M2 of CODEML detected 0.133% of sites under positive selection (ω>1) and models M1 and M2 detected 85% of sites under purifying selection (ω<1). Model M2 explains the data significantly better (p=7e-4) than model M1, which takes into account only sites with neutral and purifying selection. FUBAR of HyPhy also detected 1070 sites under purifying selection and only 2 sites under positive selection (alignment size 1284). A similar analysis performed by Benvenuto et al. (2020) with FUBAR detects 1065 sites under purifying selection and sites 536 and 644 as under positive selection. These results suggest that, by and large, strong purifying selection is acting over the vast majority of the S-protein gene, with a comparatively low proportion of sites under positive selection. The evidence of a high amount of purifying selectio...
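
The site counts and the model comparison quoted above can be checked with a little arithmetic. The sketch below is illustrative only. It assumes the alignment size of 1284 is expressed in the same site units as the FUBAR tallies. It also assumes the M1 versus M2 likelihood ratio test is referred to a chi-square distribution with 2 degrees of freedom, which is the conventional choice for this pair of models; the excerpt states neither the degrees of freedom nor the test statistic. All function and variable names are hypothetical.

```python
import math

# Values quoted in the excerpt
ALIGNMENT_SITES = 1284   # alignment size reported for the FUBAR run
PURIFYING_SITES = 1070   # sites FUBAR flags as under purifying selection
POSITIVE_SITES = 2       # sites FUBAR flags as under positive selection
REPORTED_P = 7e-4        # p-value reported for M2 versus M1


def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, expressed as a percentage."""
    return 100.0 * count / total


def chi2_sf_df2(stat: float) -> float:
    """Chi-square survival function for 2 degrees of freedom: exp(-stat / 2)."""
    return math.exp(-stat / 2.0)


def chi2_isf_df2(p_value: float) -> float:
    """Inverse of chi2_sf_df2: the statistic that yields `p_value` at 2 df."""
    return -2.0 * math.log(p_value)


purifying_pct = percent(PURIFYING_SITES, ALIGNMENT_SITES)  # about 83.3
positive_pct = percent(POSITIVE_SITES, ALIGNMENT_SITES)    # about 0.156

# Assumption: 2 degrees of freedom for the M1 versus M2 comparison
implied_stat = chi2_isf_df2(REPORTED_P)                    # about 14.5
```

Under these assumptions, FUBAR flags about 83.3% of aligned sites as purifying and about 0.16% as positive, in line with the 85% and 0.133% reported for the CODEML models, and p=7e-4 corresponds to a likelihood ratio statistic of roughly 14.5.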

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.