Predicted coronavirus Nsp5 protease cleavage sites in the human proteome

Abstract

Background

The coronavirus nonstructural protein 5 (Nsp5) is a cysteine protease required for processing the viral polyprotein and is therefore crucial for viral replication. Nsp5 from several coronaviruses has also been found to cleave host proteins, disrupting molecular pathways involved in innate immunity. Nsp5 from the recently emerged SARS-CoV-2 virus interacts with and can cleave human proteins, which may be relevant to the pathogenesis of COVID-19. Given the continuing global pandemic and the emerging understanding of coronavirus Nsp5-human protein interactions, we set out to predict which human proteins are cleaved by the coronavirus Nsp5 protease using a bioinformatics approach.

Results

Using a previously developed neural network trained on coronavirus Nsp5 cleavage sites (NetCorona), we made predictions of Nsp5 cleavage sites in all human proteins. Structures of human proteins in the Protein Data Bank containing a predicted Nsp5 cleavage site were then examined, generating a list of 92 human proteins with a highly predicted and accessible cleavage site. Of those, 48 are expected to be found in the same cellular compartment as Nsp5. Analysis of this targeted list of proteins revealed molecular pathways susceptible to Nsp5 cleavage and therefore relevant to coronavirus infection, including pathways involved in mRNA processing, cytokine response, cytoskeleton organization, and apoptosis.

Conclusions

This study combines predictions of Nsp5 cleavage sites in human proteins with protein structure information and protein network analysis. We predicted cleavage sites in proteins recently shown to be cleaved in vitro by SARS-CoV-2 Nsp5, and we discuss how other potentially cleaved proteins may be relevant to coronavirus-mediated immune dysregulation. The data presented here will assist in the design of more targeted experiments to determine the role of coronavirus Nsp5 cleavage of host proteins, which is relevant to understanding the molecular pathology of coronavirus infection.

Article activity feed

  1. SciScore for 10.1101/2021.06.08.447224:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    To overcome the input file limit of 50,000 amino acids per submission and handle sequences with non-standard amino acids, a Python script was developed.
    Python
    suggested: (IPython, RRID:SCR_001658)
    Statistical analysis and the generation of graphs were performed using GraphPad Prism (version 9.1.0). Structural Analysis: PDB metadata associated with proteins in the “Proteins With PDB” dataset that also contained a predicted Nsp5 cleavage (NetCorona score >0.5) were downloaded from the RCSB PDB website by generating a custom report in .csv format.
    GraphPad Prism
    suggested: (GraphPad Prism, RRID:SCR_002798)
    Nsp5 cleavage sites predicted by NetCorona were matched with one PDB file per cleavage site, by searching the PDB metadata for the predicted 9 amino acid cleavage motif using Microsoft Excel (Additional File 6).
    Microsoft Excel
    suggested: (Microsoft Excel, RRID:SCR_016137)
    Publication quality figures were generated using PyMOL 2.3.0
    PyMOL
    suggested: (PyMOL, RRID:SCR_000305)
    The node table (including tissue expression scores and compartments score for each protein) was exported to R for wrangling and data visualization using the tidyverse and ggrepel packages [94–96].
    ggrepel
    suggested: (ggrepel, RRID:SCR_017393)
    Protein Network Analysis: The 48 proteins with an Nsp5 access score >500 and that had the potential to be found in the same cellular compartment as Nsp5 were imported into the STRING app (again within Cytoscape) while allowing a maximum of 5 additional interactors for the network generation instead of none.
    STRING
    suggested: (STRING, RRID:SCR_005223)
    Cytoscape
    suggested: (Cytoscape, RRID:SCR_003032)
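
The Python script mentioned in the first resource sentence above is not reproduced in this report. As an illustration only, the following minimal sketch shows one way to split a set of protein sequences into submissions that each stay under the quoted limit of 50,000 amino acids. The function names, the greedy batching strategy, and the choice to replace non-standard residue codes with `X` are all assumptions made here, not the authors' method.

```python
from typing import Iterable, Iterator, List, Tuple

# The 20 standard amino acid one-letter codes.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")

# Per-submission residue limit quoted in the report.
MAX_RESIDUES = 50_000

Record = Tuple[str, str]  # (identifier, sequence)


def clean_sequence(seq: str, placeholder: str = "X") -> str:
    """Replace non-standard residue codes with a placeholder.

    Assumption: the original script's handling of non-standard
    residues is not described; substitution is one possible policy.
    """
    return "".join(c if c in STANDARD_AA else placeholder for c in seq.upper())


def batch_records(
    records: Iterable[Record], max_residues: int = MAX_RESIDUES
) -> Iterator[List[Record]]:
    """Greedily group records so each batch has at most max_residues residues."""
    batch: List[Record] = []
    total = 0
    for name, seq in records:
        seq = clean_sequence(seq)
        if len(seq) > max_residues:
            raise ValueError(f"{name} alone exceeds the {max_residues}-residue limit")
        # Start a new batch when adding this sequence would exceed the limit.
        if batch and total + len(seq) > max_residues:
            yield batch
            batch, total = [], 0
        batch.append((name, seq))
        total += len(seq)
    if batch:
        yield batch
```

Each yielded batch could then be written out as its own FASTA file for submission. Sequences longer than the limit raise an error here, since how to split a single protein is a separate decision.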

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    N-terminomics based approaches have identified many potential Nsp5 cleavage sites in human proteins [47, 48], but they have some limitations that bioinformatics can complement. Trypsin is used in the preparation of samples for mass spectrometry, which generates cleavages at lysine and arginine residues that are not N-terminal to a proline. Lysine and arginine appear in many cleavage sites predicted by NetCorona, meaning that cleavage by trypsin may mask true cleavage sites by artificially generating an N-terminus proximal to a P1 glutamine residue. Only one protein overlaps between the Koudelka et al. and Meyer et al. results, as these studies used different cell lines, and thus different proteins will be expressed, and the methods of exposure to Nsp5 also differed (cell lysate incubated with Nsp5 vs SARS-CoV-2 infection of cells) [47, 48]. Meyer et al. point out that the lysate-based method used by Koudelka et al. strips proteins of their subcellular context, which may lead to observed cleavage events that are not possible in vivo during infection [48]. In contrast, our bioinformatics analysis is cell-type and methodology agnostic as it examined the entire human proteome. The cleavage sites predicted in silico, combined with knowledge of Nsp5 subcellular localization and protein networks, identified several interesting human proteins and pathways. DHX15 contained a predicted cleavage site with the highest Nsp5 access score, and the protein may co-localize with Nsp5 in the nuc...
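
The trypsin limitation described above can be made concrete with a small sketch. It applies the rule stated in the passage (cleavage after lysine or arginine unless the next residue is proline) and reports tryptic cleavage positions that fall close to a given P1 glutamine. The function names and the `window` parameter are hypothetical, chosen here for illustration; the manuscript does not define how close counts as "proximal".

```python
from typing import List


def tryptic_sites(seq: str) -> List[int]:
    """Return indices i where trypsin would cleave between seq[i] and seq[i+1].

    Rule from the text: cleave after K or R unless the next residue is P.
    """
    return [
        i
        for i in range(len(seq) - 1)
        if seq[i] in "KR" and seq[i + 1] != "P"
    ]


def trypsin_near_glutamine(seq: str, q_index: int, window: int = 3) -> List[int]:
    """List tryptic cleavage positions within `window` residues of a P1 glutamine.

    `window` is an illustrative assumption, not a value from the manuscript.
    """
    if seq[q_index] != "Q":
        raise ValueError("q_index must point at a glutamine (Q)")
    return [i for i in tryptic_sites(seq) if abs(i - q_index) <= window]
```

A non-empty result for a predicted site suggests that tryptic digestion could produce a peptide terminus close to that site, which is the masking effect the passage describes.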

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.