SARS-CoV-2 3CLpro whole human proteome cleavage prediction and enrichment/depletion analysis
Listed in: Evaluated articles (ScreenIT)
Abstract
A novel coronavirus (SARS-CoV-2) has devastated the globe as a pandemic that has killed more than 1,600,000 people. Widespread vaccination is still uncertain, so many scientific efforts have been directed toward discovering antiviral treatments. Many drugs are being investigated to inhibit the coronavirus main protease, 3CLpro, from cleaving its viral polyprotein, but few publications have addressed this protease’s interactions with the host proteome or their probable contribution to virulence. Too few host protein cleavages have been experimentally verified to fully understand 3CLpro’s global effects on relevant cellular pathways and tissues. Here, I set out to determine this protease’s targets and corresponding potential drug targets. Using a neural network trained on cleavages from 388 coronavirus proteomes with a Matthews correlation coefficient of 0.983, I predict that a large proportion of the human proteome is vulnerable to 3CLpro, with 4,460 out of approximately 20,000 human proteins containing at least one putative cleavage site. These cleavages are nonrandomly distributed and are enriched in the epithelium along the respiratory tract, brain, testis, plasma, and immune tissues and depleted in olfactory and gustatory receptors despite the prevalence of anosmia and ageusia in COVID-19 patients. Affected cellular pathways include cytoskeleton/motor/cell adhesion proteins, nuclear condensation and other epigenetics, host transcription and RNAi, ribosomal stoichiometry and nascent-chain detection and degradation, coagulation, pattern recognition receptors, growth factors, lipoproteins, redox, ubiquitination, and apoptosis. This whole proteome cleavage prediction demonstrates the importance of 3CLpro in expected and nontrivial pathways affecting virulence, leads me to propose more than a dozen potential therapeutic targets against coronaviruses, and should therefore be applied to all viral proteases and subsequently experimentally verified.
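The abstract reports classifier quality as a Matthews correlation coefficient (MCC). As a minimal sketch of how that metric is computed from binary confusion-matrix counts, the snippet below uses the standard MCC formula. The function name and all counts are hypothetical and chosen only for illustration; they are not taken from the preprint or its repository.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Return the MCC for binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    adopted here as an assumption).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only.
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)    # every call correct
inverted = matthews_corrcoef(tp=0, tn=0, fp=50, fn=50)   # every call wrong
mixed = matthews_corrcoef(tp=90, tn=95, fp=5, fn=10)     # mostly correct
```

MCC ranges from -1 (total disagreement) to +1 (perfect prediction) and stays informative when cleaved and uncleaved classes differ in size, which is presumably why it suits a data set with far more uncleaved than cleaved windows.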
Article activity feed
SciScore for 10.1101/2020.08.24.265645:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
Sentence: 70] Searching for “orf1ab,” “pp1ab,” and “1ab” within the family Coronaviridae returned 388 different, complete polyproteins with 762 different cleavages manually discovered using the Clustal Omega multiple sequence alignment server.[71–73] All 4,268 balanced positive cleavages were used for subsequent classifier training in addition to all other uncleaved coronavirus sequence windows centered at glutamines (17,493) and histidines (11,421), totaling 33,182 samples.
Resource: Clustal Omega, suggested: (Clustal Omega, RRID:SCR_001591)
Sentence: Enrichment Analysis: Protein annotation, classification, and enrichment analysis was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) 6.8.[75, 76] My training data, prediction methods, and results can be found on GitHub (https://github.com/Luke8472NN/NetProtease).
Resource: DAVID, suggested: (DAVID, RRID:SCR_001881)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
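The Software and Algorithms sentences above describe training samples as sequence windows centered at glutamines and histidines. A minimal sketch of that kind of window extraction follows. The function name, the flank width, the padding character, and the toy peptide are all assumptions made for illustration; the excerpt does not state the window length or how sequence ends were handled, and this is not the author's code.

```python
def centered_windows(sequence, centers=("Q", "H"), flank=5, pad="-"):
    """Yield (index, residue, window) for each residue found in `centers`.

    Each window has length 2 * flank + 1 with the matched residue in the
    middle. Sequence ends are padded with `pad` (an assumption, so that
    every window has the same length).
    """
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue in centers:
            # Position i in `sequence` sits at i + flank in `padded`,
            # so the window starts at i and spans 2 * flank + 1 characters.
            yield i, residue, padded[i : i + 2 * flank + 1]


# Toy peptide, hypothetical input for illustration only.
toy = "TSAVLQSGFRKMAH"
windows = list(centered_windows(toy, flank=3))
```

Under this scheme each candidate window would then be labeled cleaved or uncleaved before classifier training; the sample counts quoted above (4,268 + 17,493 + 11,421 = 33,182) are consistent with one sample per window.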
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.
