Differential gene expression profiling reveals potential biomarkers and pharmacological compounds against SARS-CoV-2: insights from machine learning and bioinformatics approaches

Abstract

SARS-CoV-2 continues to spread and evolve worldwide despite intense efforts to develop multiple vaccines and therapeutic options against COVID-19. Moreover, the precise role of SARS-CoV-2 in the pathophysiology of the nasopharyngeal tract (NT) is still poorly understood. Therefore, we used machine learning methods to analyze 22 RNA-seq datasets from COVID-19 patients (n=8), recovered individuals (n=7), and healthy individuals (n=7) to find disease-related differentially expressed genes (DEGs). In comparison to healthy controls, we found 1960 and 153 DEG signatures in COVID-19 patients and recovered individuals, respectively. We compared dysregulated DEGs to detect critical pathways and gene ontology (GO) terms connected to COVID-19 comorbidities. In COVID-19 patients, the DEG–miRNA and DEG–transcription factor (TF) interaction network analysis revealed that E2F1, MAX, EGR1, YY1, and SRF were the most highly expressed TFs, whereas hsa-miR-19b, hsa-miR-495, hsa-miR-340, hsa-miR-101, and hsa-miR-19a were the overexpressed miRNAs. Three chemical agents (Valproic Acid, Aflatoxin B1, and Cyclosporine) were abundant in COVID-19 patients and recovered individuals. Mental retardation, mental deficit, intellectual disability, muscle hypotonia, micrognathism, and cleft palate were the significant diseases associated with COVID-19 by sharing DEGs. Finally, we detected DEGs impacted by SARS-CoV-2 infection and mediated by TF and miRNA expression, indicating that SARS-CoV-2 infection may contribute to various comorbidities. These pathogenetic findings can provide crucial insights into the complex interplay between COVID-19 and the recovery stage and support its importance in therapeutic development strategies to combat the COVID-19 pandemic.

IMPORTANCE

Although it has now been over two years since the beginning of the COVID-19 pandemic, many crucial questions about SARS-CoV-2 infection and the different COVID-19 symptoms it causes remain unresolved. An intriguing question about COVID-19 is how SARS-CoV-2 interacts with the host during infection and how the infection can cause so many disease symptoms. Our analysis of three different datasets (COVID-19, recovered, and healthy) revealed significantly more DEGs in COVID-19 patients than in recovered humans and healthy controls. Some of these DEGs were found to be co-expressed in both COVID-19 patients and recovered humans, supporting the notion that DEG levels are directly correlated with viral load, disease progression, and different comorbidities. The protein-protein interaction network, consisting of 24 nodes and 72 edges, recognized eight hub nodes as potential hub proteins (i.e., RPL4, RPS4X, RPL19, RPS12, RPL19, EIF3E, MT-CYB, and MT-ATP6). Protein–chemical interaction analysis identified three chemical agents (Valproic Acid, Aflatoxin B1, and Cyclosporine) enriched in COVID-19 patients and recovered individuals. Mental retardation, mental deficiency, intellectual disability, muscle hypotonia, micrognathism, and cleft palate were the significant diseases associated with COVID-19 by sharing DEGs.
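
To make the hub-protein idea above concrete, the following is a minimal sketch of degree-based hub ranking, which is only one simple local measure and not necessarily the scoring the authors used. The node names (`P1` to `P5`) and the toy edge list are hypothetical placeholders, not interactions from the study; only the 24-node, 72-edge figures come from the text.

```python
from collections import defaultdict


def average_degree(n_nodes: int, n_edges: int) -> float:
    """Mean degree of an undirected network: each edge touches two nodes."""
    return 2 * n_edges / n_nodes


def degree_ranking(edges):
    """Rank nodes of an undirected network by degree, highest first.

    Ties are broken alphabetically by node name so the order is stable.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sorted(
        ((node, len(nbrs)) for node, nbrs in neighbors.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical toy network; the edges are invented for illustration only.
toy_edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P4", "P5"),
]

if __name__ == "__main__":
    # Network size reported in the text: 24 nodes and 72 edges.
    print("average degree:", average_degree(24, 72))
    print("toy ranking:", degree_ranking(toy_edges))
```

A network with 24 nodes and 72 edges has an average degree of 6, so under this simple measure a hub would be a node with clearly more than six interaction partners.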

Article activity feed

  1. SciScore for 10.1101/2022.03.30.486356:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    The BioJupies generator online server (https://maayanlab.cloud/biojupies/) was used for RNA-seq raw data analysis [1].
    BioJupies
    suggested: (BioJupies, RRID:SCR_016346)
    Functional enrichment analysis: We utilized Enrichr [3] with Fisher’s exact test to conduct the functional enrichment analysis with the combined DEGs.
    Enrichr
    suggested: (Enrichr, RRID:SCR_001575)
    In Enrichr analysis, we combined the signaling pathways from two libraries, including KEGG and Reactome, to create a single route.
    KEGG
    suggested: (KEGG, RRID:SCR_012773)
    Protein-protein interaction network analysis: The shared DEGs’ protein-protein interaction (PPI) was analyzed using the STRING database [4].
    STRING
    suggested: (STRING, RRID:SCR_005223)
    We applied different local- and global-based methods using the cytoHubba plugin [5] in Cytoscape v3.8.2 [6] to determine potential hubs proteins within the PPI network.
    cytoHubba
    suggested: (cytoHubba, RRID:SCR_017677)
    Finally, the protein networks were analyzed through Cytoscape v3.8.2.
    Cytoscape
    suggested: (Cytoscape, RRID:SCR_003032)
    Using the shared DEGs, we constructed the protein-drug interaction (PDI) network through the NetworkAnalyst v3.0 web server [7] in conjunction with the DrugBank v5.0 database (https://go.drugbank.com/docs/drugbank_v5.0.xsd).
    NetworkAnalyst
    suggested: (NetworkAnalyst, RRID:SCR_016909)
    DrugBank
    suggested: (DrugBank, RRID:SCR_002700)
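
The enrichment step quoted in the table above (Enrichr with Fisher's exact test) can be illustrated with a small, self-contained sketch. This is not Enrichr's own code: it computes the one-sided (over-representation) p-value directly from the hypergeometric distribution, and the function name, parameter names, and example counts are all assumptions made for illustration.

```python
from math import comb


def fisher_enrichment_p(overlap: int, deg_count: int, set_size: int, background: int) -> float:
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is the number of DEGs that fall in a
    gene set of `set_size` genes when `deg_count` DEGs are drawn without
    replacement from `background` genes.
    """
    max_overlap = min(deg_count, set_size)
    total = comb(background, deg_count)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, deg_count - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical toy counts: 10 background genes, a pathway of 4 genes,
    # 3 DEGs, 2 of which fall in the pathway.
    p = fisher_enrichment_p(overlap=2, deg_count=3, set_size=4, background=10)
    print(f"p = {p:.4f}")  # (6*6 + 4*1) / 120 = 1/3
```

In practice the p-values for many pathways would also need multiple-testing correction, which this sketch leaves out.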

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.