Comorbidities and Susceptibility to COVID-19: A Generalized Gene Set Data Mining Approach


Abstract

As of 4 February 2021, the COVID-19 pandemic had caused over 2.26 million deaths among almost 104 million confirmed cases worldwide (WHO). Risk factors include pre-existing conditions such as cancer, cardiovascular disease, diabetes, and obesity. Although several vaccines have been deployed, there are few alternative anti-viral treatments available in the case of reduced or non-existent vaccine protection. Adopting a long-term holistic approach to cope with the COVID-19 pandemic appears critical given the emergence of novel and more infectious SARS-CoV-2 variants. Our objective was to use a computational meta-analysis approach to identify comorbidity-associated single nucleotide polymorphisms (SNPs) potentially conferring increased susceptibility to SARS-CoV-2 infection. SNP datasets were downloaded from a publicly available genome-wide association studies (GWAS) catalog for 141 of 258 candidate COVID-19 comorbidities. Gene-level SNP analysis was performed with the program MAGMA to identify significant pathways. An SNP annotation program was used to analyze MAGMA-identified genes. Differential gene expression was determined for significant genes across 30 general tissue types using GENE2FUNC, a tool of the Functional Mapping and Annotation of GWAS (FUMA) online platform. COVID-19 comorbidities (n = 22) from six disease categories were found to have significant associated pathways, validated by Q–Q plots (p < 0.05). Protein–protein interactions of significant (p < 0.05) differentially expressed genes were visualized with the STRING program. Gene interaction networks were found to be relevant to SARS and influenza pathogenesis. In conclusion, we were able to identify the pathways potentially affected by or affecting SARS-CoV-2 infection in underlying medical conditions likely to confer susceptibility to, and/or increase the severity of, COVID-19. Our findings have implications for future COVID-19 experimental research and treatment development.

Article activity feed

  1. SciScore for 10.1101/2020.09.14.20192609:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Ensembl’s Variant Effect Predictor program (VEP) [34] was used to analyze MAGMAv1.07b annotation files for each gene set associated with comorbidities [35].
    Ensembl’s Variant Effect Predictor
    suggested: None
    Variant
    suggested: (VARIANT, RRID:SCR_005194)
    Corresponding tables were merged via Python v3.8.2 and SNPs containing a Sorting Intolerant from Tolerant (SIFT) score of 0 and a Polymorphism Phenotyping2 (PolyPhen2) score of 1 were removed (Supplemental Data File 1).
    Python v3.8.2
    suggested: None
    SIFT
    suggested: (SIFT, RRID:SCR_012813)
    Human Genome Organization (HUGO) gene symbols were extracted from the table with remaining SIFT and PolyPhen2 scores.
    PolyPhen2
    suggested: None
    Genes and their corresponding Entrez IDs were then matched to the Entrez IDs of significant genes found through combined MAGMAv1.07b - STRING analysis.
    STRING
    suggested: (STRING, RRID:SCR_005223)
    Transcriptional gene expression analysis: GEO2R [39] was used to test the top 250 human mRNA gene expressions for each comorbidity based on available human data from NCBI GEO [39], including only comorbidities that had significant pathways identified by MAGMAv1.07b and VEP STRING analyses.
    GEO2R
    suggested: (GEO2R, RRID:SCR_016569)
    NCBI
    suggested: (NCBI, RRID:SCR_006472)
    Gene involvement in influenza and/or SARS: Significant genes (n=119) were investigated to determine their roles in relation to influenza and/or SARS respiratory viral infections.
    SARS
    suggested: None
    Genes were cross-referenced using Pubmed [43] literature searches, DisGeNETv6 [44], Influenza Research Database [45] and conventional Google searches including HUGO gene symbol and either “influenza” or “SARS” [46, 47].
    Pubmed
    suggested: (PubMed, RRID:SCR_004846)
    HUGO
    suggested: (HUGO, RRID:SCR_012800)
    Human tissue expression relevant to COVID-19 for genes with direct involvement was validated using Ensembl Expression Atlas [49, 50].
    Ensembl
    suggested: (Ensembl, RRID:SCR_002344)
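
The merge-and-filter step quoted in Table 2 (merging annotation tables in Python, removing SNPs by SIFT and PolyPhen2 score, then extracting HUGO gene symbols) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' script: the column names (`snp_id`, `sift`, `polyphen2`, `gene_symbol`), the function names, and the sample rows are all hypothetical placeholders. It also assumes that a SNP was removed only when both conditions held together (SIFT = 0 and PolyPhen2 = 1), which the quoted sentence does not state unambiguously.

```python
from typing import Dict, List

Row = Dict[str, object]


def merge_on_snp(annotations: List[Row], scores: List[Row]) -> List[Row]:
    """Inner-join two lists of row dicts on the hypothetical 'snp_id' key."""
    scores_by_id = {row["snp_id"]: row for row in scores}
    merged = []
    for row in annotations:
        match = scores_by_id.get(row["snp_id"])
        if match is not None:
            # Combine the columns of both tables into a single row.
            merged.append({**row, **match})
    return merged


def drop_extreme_scores(rows: List[Row]) -> List[Row]:
    """Drop rows with SIFT == 0 and PolyPhen2 == 1.

    Assumption: both conditions must hold for a row to be removed.
    """
    return [
        r for r in rows
        if not (r["sift"] == 0.0 and r["polyphen2"] == 1.0)
    ]


def gene_symbols(rows: List[Row]) -> List[str]:
    """Return the sorted, de-duplicated gene symbols of the remaining rows."""
    return sorted({str(r["gene_symbol"]) for r in rows})


# Made-up placeholder rows; these are not data from the study.
annotations = [
    {"snp_id": "rs1", "gene_symbol": "GENE_A"},
    {"snp_id": "rs2", "gene_symbol": "GENE_B"},
    {"snp_id": "rs3", "gene_symbol": "GENE_C"},
]
scores = [
    {"snp_id": "rs1", "sift": 0.0, "polyphen2": 1.0},
    {"snp_id": "rs2", "sift": 0.02, "polyphen2": 0.95},
    {"snp_id": "rs4", "sift": 0.5, "polyphen2": 0.1},
]

merged = merge_on_snp(annotations, scores)
kept = drop_extreme_scores(merged)
symbols = gene_symbols(kept)
```

An inner join is used here so that only SNPs present in both tables carry forward; if the original merge kept unmatched rows, the filter would also need to handle missing scores.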

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Limitations: While there is no shortage of publicly available data, not all diseases have the same level of dedicated research. Therefore, not all possible comorbidities had publicly available SNP datasets from GWAS catalog or human mRNA gene expression datasets from NCBI’s GEO datasets database. This resulted in a large decrease from 258 possible comorbidities to 141. Additionally, we were only able to use 19 of 22 significant comorbidities for GEO2R analysis and heatmap visualization. Another caveat is that GEO2R mRNA expression datasets have been generated through different independent studies using different genomic platforms and analysis pipelines, so that optimal normalization of raw data cannot be implemented. Little is still known about COVID-19 pathogenesis, although research on the matter has increased greatly since the beginning of the pandemic. Conclusions: Significant pathways were identified associated with comorbidities/underlying medical conditions conferring susceptibility and/or severity to SARS-CoV-2 infection, which have been reported in conjunction with decreased clinical outcomes. Our findings may have implications in development of COVID-19 therapies.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.