An atlas connecting shared genetic architecture of human diseases and molecular phenotypes provides insight into COVID-19 susceptibility

Abstract

Background

While genome-wide association studies (GWAS) have successfully elucidated the genetic architecture of complex human traits and diseases, understanding the mechanisms that lead from genetic variation to pathophysiology remains an important challenge. Methods are needed to systematically bridge this gap to facilitate experimental testing of hypotheses and translation to clinical utility.

Results

Here, we leveraged cross-phenotype associations to identify traits with shared genetic architecture, using linkage disequilibrium (LD) information to capture shared SNPs by proxy and to calculate the significance of enrichment. This shared genetic architecture was examined across differing biological scales by incorporating data from catalogs of clinical, cellular, and molecular GWAS. We created an interactive web database, the interactive Cross-Phenotype Analysis of GWAS database (iCPAGdb), to facilitate exploration and allow rapid analysis of user-uploaded GWAS summary statistics. This database recovered well-known relationships among phenotypes and generated novel hypotheses to explain the pathophysiology of common diseases. Application of iCPAGdb to a recent GWAS of severe COVID-19 demonstrated unexpected overlap of GWAS signals between COVID-19 and other human diseases, including idiopathic pulmonary fibrosis, where the overlap is driven by the DPP9 locus. Transcriptomics from peripheral blood of COVID-19 patients demonstrated that DPP9 was induced in patients with SARS-CoV-2 infection compared with healthy controls or patients with bacterial infection. Further investigation of cross-phenotype SNPs associated with both severe COVID-19 and other human traits demonstrated colocalization of the GWAS signal at the ABO locus with plasma protein levels of a reported receptor of SARS-CoV-2, CD209 (DC-SIGN). This finding points to a possible mechanism whereby glycosylation of CD209 by ABO may regulate COVID-19 disease severity.

Conclusions

Thus, connecting genetically related traits across phenotypic scales links human diseases to molecular and cellular measurements that can reveal mechanisms and lead to novel biomarkers and therapeutic approaches. The iCPAGdb web portal is accessible at http://cpag.oit.duke.edu and the software code at https://github.com/tbalmat/iCPAGdb .
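
As a rough illustration of the approach summarized in the Results, the following Python sketch collapses SNPs into LD blocks so that proxies count as shared, then scores the overlap between two traits with a one-sided hypergeometric (Fisher-type) test. The choice of test, the function names, the `ld_block_of` mapping, the background size, and the toy rsIDs are all assumptions made for illustration; this is not the iCPAGdb implementation.

```python
from math import comb


def to_ld_blocks(snps, ld_block_of):
    """Map each SNP to its LD block ID (hypothetical mapping).

    SNPs without an entry in the mapping are treated as their own block.
    """
    return {ld_block_of.get(snp, snp) for snp in snps}


def enrichment_p(n_shared, n_a, n_b, n_total):
    """One-sided hypergeometric tail: P(X >= n_shared).

    n_a, n_b: number of LD blocks associated with trait A and trait B.
    n_total:  assumed number of independent LD blocks in the background.
    """
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_shared, upper + 1)
    )
    return tail / comb(n_total, n_b)


def cross_phenotype_overlap(snps_a, snps_b, ld_block_of, n_total):
    """Return the shared LD blocks of two traits and the enrichment p-value."""
    blocks_a = to_ld_blocks(snps_a, ld_block_of)
    blocks_b = to_ld_blocks(snps_b, ld_block_of)
    shared = blocks_a & blocks_b
    p = enrichment_p(len(shared), len(blocks_a), len(blocks_b), n_total)
    return shared, p


# Toy inputs (placeholder rsIDs, not real associations).
# rs1 and rs4 are different SNPs in the same LD block, so they match by proxy.
ld_block_of = {"rs1": "blk1", "rs4": "blk1", "rs2": "blk2", "rs5": "blk3"}
trait_a = {"rs1", "rs2", "rs3"}
trait_b = {"rs4", "rs5"}

shared, p = cross_phenotype_overlap(trait_a, trait_b, ld_block_of, n_total=100)
print(sorted(shared), p)
```

With these toy inputs the two traits have no SNP in common by rsID, yet they share one LD block through proxies. A real analysis would derive the LD blocks and the background count from a population-matched reference panel.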

Article activity feed

  1. SciScore for 10.1101/2020.12.20.20248572:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    For duplicated SNPs with the same variant rsID, we kept only the first variant by using “--rm-dup force-first” in PLINK 2.0. Cross-phenotype SNP analysis: Cross-phenotype SNPs were used to quantify the similarity of different traits.
    PLINK
    suggested: (PLINK, RRID:SCR_001757)
    STAR v 2.7.1 (Dobin et al., 2013) was used to align the short reads and generate the count matrix.
    STAR
    suggested: (STAR, RRID:SCR_015899)
    The back-end was written in Python v3.6 with utilization of SQLite.
    python
    suggested: (IPython, RRID:SCR_001658)
    Important packages used in this mode are DT for construction of and interaction with tables and ggplot2, plotly, and heatmaply for basic plotting, interactive plotting (hover labels), and heatmap generation, respectively.
    ggplot2
    suggested: (ggplot2, RRID:SCR_014601)
    The web browser also allows users to upload their own GWAS summary data, and iCPAGdb will automatically perform LD clumping based on selected population and generate an atlas of connections for the user’s GWAS against > 4400 GWAS traits in the database.
    iCPAGdb
    suggested: None
    NHGRI GWAS Catalog: https://www.ebi.ac.uk/gwas/
    https://www.ebi.ac.uk/gwas/
    suggested: (GWAS: Catalog of Published Genome-Wide Association Studies, RRID:SCR_012745)
    COVID-19 GWAS summary statistics from Ellinghaus et al. (2020): https://grasp.nhlbi.nih.gov/Covid19GWASResults.aspx
    IPF GWAS: download link was obtained by applying for access following the collaborative protocol from https://github.com/genomicsITER/PFgenetics
    Tools for visualization: R packages:
      ggplot2: https://cran.r-project.org/web/packages/ggplot2/
      gggenes: https://cran.r-project.org/web/packages/gggenes/index.html
      tidygraph: https://cran.r-project.org/web/packages/tidygraph/
      ggnetwork: https://cran.r-project.org/web/packages/ggnetwork/
      circlize: https://cran.r-project.org/web/packages/circlize/
      ggpubr: https://cran.r-project.org/web/packages/ggpubr/
      DT: https://cran.r-project.org/web/packages/DT
      plotly: https://cran.r-project.org/web/packages/plotly/
      heatmaply: https://cran.r-project.org/web/packages/heatmaply/
      promises: https://CRAN.R-project.org/package=promises
    Further information and requests for resources should be directed to and will be fulfilled by the Lead Contact, Dennis C. Ko (dennis.ko(at)duke.edu).
    https://cran.r-project.org/web/packages/circlize/
    suggested: (circlize, RRID:SCR_002141)
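
The duplicate-handling rule quoted in the table above (keep only the first variant for each rsID) can be mimicked in plain Python as follows. This is a hedged sketch of the rule's behavior only, not a reimplementation of PLINK; the record layout and the `keep_first_by_rsid` helper are hypothetical.

```python
def keep_first_by_rsid(variants):
    """Keep only the first record seen for each rsID, preserving input order.

    `variants` is an iterable of dicts with at least an "rsid" key
    (an assumed layout chosen for this example).
    """
    seen = set()
    kept = []
    for variant in variants:
        rsid = variant["rsid"]
        if rsid not in seen:
            seen.add(rsid)
            kept.append(variant)
    return kept


# Toy records with placeholder IDs and positions.
variants = [
    {"rsid": "rs1", "pos": 100},
    {"rsid": "rs2", "pos": 200},
    {"rsid": "rs1", "pos": 150},  # duplicate rsID, dropped
]
deduplicated = keep_first_by_rsid(variants)
print(deduplicated)
```

Keeping the first occurrence makes the result depend on input order, so the order of the variant file determines which duplicate survives.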

    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.