Detailed phylogenetic analysis of SARS-CoV-2 reveals latent capacity to bind human ACE2 receptor

Abstract

SARS-CoV-2 is a unique event, having emerged suddenly as a highly infectious viral pathogen for human populations. Previous phylogenetic analyses show its closest known evolutionary relative to be a virus detected in bats (RaTG13), with a common assumption that SARS-CoV-2 evolved from a zoonotic ancestor via recent genetic changes (likely in the Spike protein receptor binding domain, or RBD) that enabled it to infect humans. We used detailed phylogenetic analysis, ancestral sequence reconstruction, and in situ molecular dynamics simulations to examine the Spike-RBD’s functional evolution, finding that the common ancestral virus with RaTG13, dating to at least 2013, possessed high binding affinity to the human ACE2 receptor. This suggests that SARS-CoV-2 likely possessed a latent capacity to bind to human cellular targets (though this may not have been sufficient for successful infection) and emphasizes the importance of expanding the cataloging and monitoring of viruses circulating in both human and non-human populations.

Article activity feed

  1. SciScore for 10.1101/2020.06.22.165787:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Selected sequences were aligned using the Multiple Alignment using Fast Fourier Transform Version 7 (MAFFT) FFT-NS-2 algorithm.[52,53] MAFFT default parameters were used in our alignment, meaning gap penalties were assigned a value of 1.53.
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    PhyML 3.0 was employed to construct a phylogeny of aligned genomes.
    PhyML
    suggested: (PhyML, RRID:SCR_014629)
    Protein sequences were initially aligned using the Multiple Sequence Alignment by Log-Expectation (MUSCLE) program.[60] The optimal parameters for phylogenetic reconstruction analysis were taken from the best-fit evolutionary model selected using the Akaike Information Criterion (AIC) implemented in the ProtTest3 software,[61] and were inferred to be the Jones-Taylor-Thornton (JTT) model[62] with gamma-distributed among-site rate variation and empirical state frequencies.
    Log-Expectation
    suggested: None
    MUSCLE
    suggested: (MUSCLE, RRID:SCR_011812)
    Phylogeny was inferred from these alignments using the RAxML v8.2.9 software[63] and results were visualized using FigTree v1.4.4 (https://github.com/rambaut/figtree/releases).
    RAxML
    suggested: (RAxML, RRID:SCR_006086)
    FigTree
    suggested: (FigTree, RRID:SCR_008515)
    Ancestral sequence reconstruction was performed with the FastML software[64] and further validated independently using the Graphical Representation of Ancestral Sequence Predictions (GRASP) software.[65] Statistical confidence in each position’s reconstructed state for each ancestor was determined from posterior probability; any reconstructed position with less than 95% posterior probability was considered ambiguous, and alternate states were also tested.
    FastML
    suggested: (Fastml, RRID:SCR_016092)
    Using the PyMOL mutagenesis wizard,[66] the four missense mutations (R346t, A372t, Q498h or Q498y, H519n) identified between the N0 and N1 sequences were introduced into the SARS-CoV-2 RBD sequence, reproducing the putative ancestral zoonotic (N0) sequence.
    PyMOL
    suggested: (PyMOL, RRID:SCR_000305)
    We used TIP3P waters and the CHARM07 FF03 parameters for proteins, as implemented in GROMACS 4.5.5.[67] Analyses were performed using VMD 1.9.1.[68] GROMACS output was loaded into Visual Molecular Dynamics (VMD) for root-mean-square deviation (RMSD) analysis using the RMSD Trajectory Tool (ref).
    GROMACS
    suggested: (GROMACS, RRID:SCR_014565)
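
    For readers unfamiliar with the RMSD analysis named in the last row above, the quantity itself can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the authors' workflow: the manuscript reports using VMD's RMSD trajectory tool on GROMACS output, whereas the function names `rmsd` and `rmsd_series` and the toy coordinates below are hypothetical, and the sketch assumes the frames are already superimposed (no fitting step is performed).

```python
import math


def rmsd(ref, frame):
    """Root-mean-square deviation between two sets of 3D coordinates.

    Assumes both frames list the same atoms in the same order and are
    already superimposed; no alignment is performed here.
    """
    if len(ref) != len(frame) or not ref:
        raise ValueError("frames must be non-empty and equally sized")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(ref, frame):
        # Squared Euclidean distance between matching atoms.
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(ref))


def rmsd_series(trajectory, ref_index=0):
    """RMSD of every frame relative to one reference frame."""
    ref = trajectory[ref_index]
    return [rmsd(ref, frame) for frame in trajectory]


# Toy three-atom trajectory (hypothetical coordinates, arbitrary units).
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],  # reference frame
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)],  # every atom moved 1 along z
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 2.0)],  # one atom moved 2 along z
]
series = rmsd_series(toy_trajectory)
```

    In the toy data the second frame is a rigid shift of the first, so its RMSD equals the shift distance. A tool that fits each frame to the reference before measuring would report a value near zero for that frame, which is why the superposition assumption matters when comparing numbers.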

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.