Integrative analyses identify susceptibility genes underlying COVID-19 hospitalization

Abstract

Despite rapid progress in characterizing the role of host genetics in SARS-CoV-2 infection, there is limited understanding of genes and pathways that contribute to COVID-19. Here, we integrated a genome-wide association study of COVID-19 hospitalization (7,885 cases and 961,804 controls from the COVID-19 Host Genetics Initiative) with mRNA expression, splicing, and protein levels (n=18,502). We identified 27 genes related to inflammation and coagulation pathways whose genetically predicted expression was associated with COVID-19 hospitalization. We functionally characterized the 27 genes using phenome- and laboratory-wide association scans in the Vanderbilt Biobank (BioVU; n=85,460) and identified coagulation-related clinical symptoms, immunologic, and blood-cell-related biomarkers. We replicated these findings across trans-ethnic studies and observed consistent effects in individuals of diverse ancestral backgrounds in BioVU, pan-UK Biobank, and Biobank Japan. Our study highlights putative causal genes impacting COVID-19 severity and symptomology through the host inflammatory response.

Article activity feed

  1. SciScore for 10.1101/2020.12.07.20245308:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: To reduce the number of tests and increase statistical power, we restricted to genes whose protein levels exhibited evidence of genetic control by testing for non-zero cis-heritability (p-value < 0.05) using GCTA.
    Sex as a biological variable: not detected.
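The Power Analysis entry above describes a pre-filter: only genes with nominally significant cis-heritability are carried forward, so fewer association tests are run. The sketch below illustrates why that can help power, assuming a Bonferroni-style correction (the quoted sentence does not say which correction the authors used). The gene names, p-values, and function names are hypothetical, and the heritability p-values are taken as given rather than computed with GCTA.

```python
from typing import Dict, List


def filter_heritable(h2_p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Keep genes whose cis-heritability test p-value is below alpha."""
    return sorted(gene for gene, p in h2_p_values.items() if p < alpha)


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction (an assumption here)."""
    return alpha / n_tests


# Made-up heritability p-values for four hypothetical genes.
toy_h2_p = {"GENE_A": 0.001, "GENE_B": 0.20, "GENE_C": 0.04, "GENE_D": 0.75}

kept = filter_heritable(toy_h2_p)
before = bonferroni_threshold(len(toy_h2_p))  # threshold if all genes were tested
after = bonferroni_threshold(len(kept))       # threshold after the heritability filter

print(kept)
print(before, after)
```

With fewer genes tested, the per-test threshold is less stringent, which is the sense in which the filter trades coverage for power.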

    Table 2: Resources

    Software and Algorithms
    Sentence: We used the PheWAS package in R to perform logistic regressions to identify the phecodes that are significantly associated with imputed gene expression after adjusting for sex, age, and the top ten principal components from genetic data to control for population stratification (Denny et al. 2010, 2013).
    Resource: PheWAS; suggested: (PheWAS Catalog, RRID:SCR_003562)
    Sentence: We clumped SNPs in PLINK using eQTL/sQTL p-values (q-value ≤ 0.05) reported in GTEx v8 and limited pair-wise SNP correlations to r² = 0.1 over 250kb windows.
    Resource: PLINK; suggested: (PLINK, RRID:SCR_001757)
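The clumping step quoted in Table 2 can be illustrated with a small greedy procedure: rank SNPs by p-value, keep the most significant one as an index SNP, and drop any other SNP within the window that is correlated with it above the r² cutoff. This is a simplified sketch of the general idea, not PLINK's implementation; the SNP identifiers, positions, p-values, and pairwise r² values are invented, and a single chromosome is assumed.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Snp(NamedTuple):
    snp_id: str
    pos: int        # base-pair position (single chromosome assumed)
    p_value: float  # QTL association p-value


def greedy_clump(
    snps: List[Snp],
    r2: Dict[FrozenSet[str], float],
    r2_threshold: float = 0.1,
    window_bp: int = 250_000,
) -> List[str]:
    """Return index SNP ids, most significant first.

    A SNP is absorbed into a clump when it lies within window_bp of the
    index SNP and its r2 with the index SNP is at least r2_threshold.
    Pairs missing from the r2 table are treated as uncorrelated.
    """
    ranked = sorted(snps, key=lambda s: s.p_value)
    removed = set()
    index_snps = []
    for lead in ranked:
        if lead.snp_id in removed:
            continue
        index_snps.append(lead.snp_id)
        removed.add(lead.snp_id)
        for other in ranked:
            if other.snp_id in removed:
                continue
            close = abs(other.pos - lead.pos) <= window_bp
            pair_r2 = r2.get(frozenset((lead.snp_id, other.snp_id)), 0.0)
            if close and pair_r2 >= r2_threshold:
                removed.add(other.snp_id)
    return index_snps


# Hypothetical toy data: snpB is in LD with snpA and nearby; snpD is in LD
# with snpA but lies outside the 250 kb window, so it is kept.
toy_snps = [
    Snp("snpA", 1_000, 1e-8),
    Snp("snpB", 50_000, 1e-5),
    Snp("snpC", 120_000, 1e-6),
    Snp("snpD", 900_000, 1e-4),
]
toy_r2 = {
    frozenset(("snpA", "snpB")): 0.80,
    frozenset(("snpA", "snpC")): 0.02,
    frozenset(("snpB", "snpC")): 0.05,
    frozenset(("snpA", "snpD")): 0.90,
}

print(greedy_clump(toy_snps, toy_r2))
```

In this toy run, snpB is dropped because it is both close to and correlated with snpA, while snpC (weakly correlated) and snpD (too far away) remain as separate index SNPs.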

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our results are consistent with previous studies investigating the impact of inflammation on severe COVID-19 outcomes; however, we note there are limitations. First, TWAS analyses rely on SNP-based predictive models of mRNA and alternative splicing trained using mostly European-ancestry individuals in GTEx v8 (29). While consistent with the ancestry makeup of COVID-19 HGI GWAS (6; https://www.covid19hg.org/), applying these models to non-European individuals (e.g., African Americans in BioVU) will result in loss of power or bias due to different underlying linkage disequilibrium patterns. Second, TWAS uses mRNA, alternative splicing, or protein levels in bulk tissue, with cell-type effects likely to be missed. Third, TWAS assumes additivity of SNP effects on gene expression and downstream hospitalization risk, which ignores the possibility of epistatic and gene-environment interactions contributing to COVID-19 related hospitalization risk. Finally, our study focuses on the host genetic factors that contribute to severe COVID-19 but did not incorporate the social determinants of health that are known to influence risk for severe COVID-19. The biological insights identified here should not be interpreted as explanatory factors for disparity, but instead as key genomic pathways modulating host response to SARS-CoV-2 across populations. Functional studies of key genes identified are needed to identify mechanisms through which these genes influence COVID-19 related hospitalizati...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.