Integrative COVID-19 biological network inference with probabilistic core decomposition
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for millions of deaths around the world. To improve the understanding of SARS-CoV-2 and human protein interactions and to generate new hypotheses about them, we use the information-rich Biomine probabilistic database to extend the experimentally identified SARS-CoV-2-human protein–protein interaction (PPI) network in silico. We generate an extended network by integrating information from the Biomine database, the PPI network and other experimentally validated results. To generate novel hypotheses, we focus on the high-connectivity sub-communities in the extended network that overlap most with the integrated experimentally validated results. To this end, we propose a new data analysis pipeline that efficiently computes core decomposition on the extended network and identifies dense subgraphs. We then evaluate the identified dense subgraphs and the generated hypotheses in three contexts: literature validation for uncovered virus-targeting genes and proteins, gene function enrichment analysis on subgraphs, and literature support on drug repurposing for identified tissues and diseases related to COVID-19. The generated hypotheses mainly concern proteins and their encoding genes, which we rank by their connections to the integrated experimentally validated nodes. In addition, we compile a comprehensive list of novel genes and proteins potentially related to COVID-19, as well as novel diseases that might be comorbidities. Together with the generated hypotheses, our results provide novel knowledge relevant to COVID-19 for further validation.
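To make the idea of core decomposition on a probabilistic network more concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the commonly used (k, η)-core definition, in which a node's η-degree is the largest k such that the node has at least k existing edges with probability at least η, and it peels nodes in order of increasing η-degree. The function names `eta_degree` and `probabilistic_core_numbers`, the edge-list input format and the threshold `eta` are all hypothetical choices made for this illustration; whether the abstract's pipeline uses this exact definition is an assumption.

```python
from collections import defaultdict


def eta_degree(probs, eta):
    """Largest k such that P(at least k of the given edges exist) >= eta."""
    # dist[j] = probability that exactly j of the edges processed so far exist
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # this edge is absent
            nxt[j + 1] += q * p       # this edge is present
        dist = nxt
    # Accumulate the upper tail from the largest possible degree downwards.
    tail = 0.0
    for k in range(len(dist) - 1, 0, -1):
        tail += dist[k]
        if tail >= eta:
            return k
    return 0


def probabilistic_core_numbers(edges, eta):
    """Peel nodes by increasing eta-degree; edges are (u, v, probability)."""
    adj = defaultdict(dict)
    for u, v, p in edges:
        adj[u][v] = p
        adj[v][u] = p
    deg = {v: eta_degree(list(nb.values()), eta) for v, nb in adj.items()}
    core, current = {}, 0
    while deg:
        v = min(deg, key=deg.get)           # node with the smallest eta-degree
        current = max(current, deg.pop(v))  # core numbers never decrease
        core[v] = current
        for u in list(adj[v]):
            del adj[u][v]                   # drop the edge to the removed node
            deg[u] = eta_degree(list(adj[u].values()), eta)
        del adj[v]
    return core
```

Nodes with the highest core number form the densest region under this definition. The sketch recomputes η-degrees from scratch after every removal, which is simple but slow; a pipeline aimed at a network of Biomine's size would need a more efficient update scheme.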
Article activity feed
SciScore for 10.1101/2021.06.23.449535:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentence: 2.1 Biomine database and SARS-CoV-2-host protein-protein interaction network: The Biomine database is a large probabilistic biological network constructed using selected publicly available databases, for example, Entrez Gene, UniProt, STRING, InterPro, PubMed
Resources:
- Entrez Gene, suggested: (Entrez Gene, RRID:SCR_002473)
- STRING, suggested: (STRING, RRID:SCR_005223)
- InterPro, suggested: (InterPro, RRID:SCR_006695)
- PubMed, suggested: (PubMed, RRID:SCR_004846)
Sentence: There are many nodes in Biomine that can be directly connected to the PPI network, which are potentially more useful than other nodes.
Resources:
- Biomine, suggested: (Biomine, RRID:SCR_003552)
Sentence: To evaluate functional pathways of proteins involved in SARS-CoV-2 host interactions from the core decomposition result of PA, gene enrichment analysis was performed using clusterProfiler [22] and Metascape [23].
Resources:
- clusterProfiler, suggested: (clusterProfiler, RRID:SCR_016884)
- Metascape, suggested: (Metascape, RRID:SCR_016620)
Sentence: Since Metascape and DAVID both restrict input gene list size up to 3000, if our list exceeds that number, we will select the top 3000 nodes based on their connections to the original PPI network nodes.
Resources:
- DAVID, suggested: (DAVID, RRID:SCR_001881)
Results from OddPub: Thank you for sharing your code.
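The list-size rule quoted in Table 2 (keep the top 3000 nodes, ranked by their connections to the original PPI network nodes, when a gene list is too long for Metascape or DAVID) could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the function name `cap_gene_list`, the adjacency-dictionary input and the alphabetical tie-break are hypothetical, and "connections" is read here as the count of direct neighbours that belong to the PPI network.

```python
def cap_gene_list(genes, adjacency, ppi_nodes, limit=3000):
    """Return at most `limit` genes, preferring those linked to more PPI nodes.

    genes      -- iterable of gene identifiers (strings)
    adjacency  -- dict mapping a gene to an iterable of its neighbours
    ppi_nodes  -- iterable of nodes from the original PPI network
    """
    genes = list(genes)
    if len(genes) <= limit:
        return genes  # short enough already: leave the list untouched

    ppi = set(ppi_nodes)

    def ppi_links(gene):
        # Assumed reading of "connections": direct neighbours in the PPI set.
        return len(ppi.intersection(adjacency.get(gene, ())))

    # Most PPI links first; ties broken alphabetically (an arbitrary choice).
    ranked = sorted(genes, key=lambda g: (-ppi_links(g), g))
    return ranked[:limit]
```

For example, with `limit=2`, a gene linked to two PPI nodes is kept ahead of a gene linked to one, and a gene with no PPI neighbours is dropped.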
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.