A graph-based approach identifies dynamic H-bond communication networks in spike protein S of SARS-CoV-2

Abstract

Coronavirus spike protein S is a large homo-trimeric protein embedded in the membrane of the virion particle. Protein S binds to angiotensin-converting enzyme 2 (ACE2) of the host cell; binding is followed by proteolysis of the spike protein, a drastic conformational change that exposes the fusion peptide of the virus, and entry of the virion into the host cell. The structural elements that govern conformational plasticity of the spike protein are largely unknown. Here, we present a methodology that relies upon graph and centrality analyses, augmented by bioinformatics, to identify and characterize large H-bond clusters in protein structures. We apply this methodology to the protein S ectodomain and find that, in the closed conformation, the three protomers of protein S contribute equally to an extensive central network of H-bonds, and that protein S has a relatively large H-bond cluster at the receptor binding domain and a cluster near a protease cleavage site. Markedly different H-bonding at these three clusters in the open and pre-fusion conformations suggests that dynamic H-bond clusters could facilitate structural plasticity and selection of a protein S protomer for binding to the host receptor, and proteolytic cleavage. From analyses of spike protein sequences, we identify patches of histidine and carboxylate groups that could be involved in transient proton binding.
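
To make the idea of graph and centrality analyses of H-bond clusters concrete, the following is a minimal sketch and not the authors' implementation. It assumes H-bonds are already available as pairs of residue labels; the labels, the function names, and the use of degree centrality are illustrative assumptions, since the abstract does not say which centrality measures or H-bond criteria are used.

```python
from collections import defaultdict, deque

# Hypothetical H-bond pairs (chain:residue labels invented for illustration).
hbonds = [
    ("A:D10", "A:R20"),
    ("A:R20", "B:E30"),
    ("B:E30", "C:H40"),
    ("C:H40", "A:D10"),
    ("B:S50", "B:T60"),
]


def build_graph(pairs):
    """Build an undirected graph: residues are nodes, H-bonds are edges."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def clusters(graph):
    """Return connected components (H-bond clusters), largest first."""
    seen = set()
    found = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:  # breadth-first search from the start node
            node = queue.popleft()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        found.append(sorted(component))
    return sorted(found, key=len, reverse=True)


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly bonded to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


graph = build_graph(hbonds)
for cluster in clusters(graph):
    print(len(cluster), cluster)
```

In this toy input, the largest cluster spans residues from three chains, which loosely mirrors the notion of a network to which several protomers contribute. Comparing such clusters between structures (for example closed versus open conformations) would require running the same analysis on each structure's H-bond list.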

Article activity feed

  1. SciScore for 10.1101/2020.06.23.164947:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: The resulting sequences were realigned with MAFFT using SARS-CoV-2 as reference.
    Resource: MAFFT (suggested: MAFFT, RRID:SCR_011811)

    Sentence: Set-D contains human sequences of ACE2 from the 1000 human Genome project (Auton and Brooks, 2015); for this set, we used the Ensembl project (Hunt et al., 2018) and extracted the different protein haplotypes existing in the 1000 Genome Project for ACE2 using the GRCh38 human genome assembly as reference.
    Resource: Ensembl (suggested: Ensembl, RRID:SCR_002344)

    Sentence: Computations of the electrostatic potential surface: were performed with the Adaptive Poisson Boltzmann Solver, APBS (Baker et al., 2001), in PyMol 2.0 (Schrödinger, 2015).
    Resource: PyMol (suggested: PyMOL, RRID:SCR_000305)

    Sentence: As computations of average H-bond graphs require the same number of amino acid residues in the graphs to be averaged, where needed we used Modeller 9.21 (Marti-Renom et al., 2000) to construct coordinates for missing amino acid residues.
    Resource: Modeller (suggested: MODELLER, RRID:SCR_008395)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on pages 19, 20, 29 and 23. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.