Integrating single-cell sequencing data with GWAS summary statistics reveals CD16+monocytes and memory CD8+T cells involved in severe COVID-19

Abstract

Background

Understanding the host genetic architecture and viral immunity contributes to the development of effective vaccines and therapeutics for controlling the COVID-19 pandemic. Alterations of immune responses in peripheral blood mononuclear cells play a crucial role in the detrimental progression of COVID-19. However, the effects of host genetic factors on immune responses for severe COVID-19 remain largely unknown.

Methods

We constructed a computational framework to characterize the host genetics that influence immune cell subpopulations in severe COVID-19 by integrating GWAS summary statistics (N = 969,689 samples) with four independent scRNA-seq datasets containing healthy controls and patients with mild, moderate, and severe symptoms (N = 606,534 cells). We collected 10 predefined gene sets, including inflammatory and cytokine genes, to calculate cell state scores for evaluating the immunological features of individual immune cells.
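
As a rough illustration of a per-cell gene-set score, the sketch below averages z-scored expression over the genes in a set. The abstract does not state the scoring formula, so the z-score averaging, the function name `cell_state_score`, and the toy expression values are all assumptions made for illustration, not the authors' method.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one gene's expression across cells (population SD)."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        # A gene with no variation carries no information for the score.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def cell_state_score(expression, gene_set):
    """Average z-scored expression of the gene-set members for each cell.

    expression: dict mapping gene name -> list of normalized values, one per cell.
    gene_set: iterable of gene names; genes absent from `expression` are skipped.
    """
    present = [g for g in gene_set if g in expression]
    if not present:
        raise ValueError("no gene from the set is present in the expression data")
    z = [zscore(expression[g]) for g in present]
    n_cells = len(z[0])
    return [mean(row[i] for row in z) for i in range(n_cells)]


# Toy data (hypothetical values): three cells, three genes.
expression = {
    "S100A8": [0.0, 1.0, 2.0],
    "S100A9": [1.0, 1.0, 4.0],
    "IFITM1": [2.0, 2.0, 2.0],
}
scores = cell_state_score(expression, ["S100A8", "S100A9", "MISSING"])
```

In this toy input the third cell has the highest expression of both scored genes, so it receives the highest score; a real analysis would run on the normalized scRNA-seq matrix and would typically also correct for background expression.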

Results

We found that 34 risk genes were significantly associated with severe COVID-19, and the number of highly expressed genes increased with the severity of COVID-19. Three cell subtypes, namely CD16+ monocytes, megakaryocytes, and memory CD8+ T cells, were significantly enriched for COVID-19-related genetic association signals. Notably, three causal risk genes, CCR1, ABO, and CXCR6, were highly expressed in these three cell types, respectively. CCR1+ CD16+ monocytes and ABO+ megakaryocytes, which carry significantly up-regulated genes including S100A12, S100A8, S100A9, and IFITM1, confer higher risk of a dysregulated immune response among severe patients. CXCR6+ memory CD8+ T cells exhibit notable polyfunctionality, including elevated proliferation, migration, and chemotaxis. Moreover, we observed an increase in cell-cell interactions of both CCR1+ CD16+ monocytes and CXCR6+ memory CD8+ T cells in severe patients compared to normal controls in both PBMCs and lung tissues. The enhanced interactions of CXCR6+ memory CD8+ T cells with epithelial cells facilitate the recruitment of this specific population of T cells to the airways, promoting CD8+ T cell-mediated immunity against COVID-19 infection.

Conclusions

We uncover a major genetics-modulated immunological shift between mild and severe infection, including elevated expression of genetics-risk genes, an increase in inflammatory cytokines, and an increase in functional immune cell subsets that aggravate disease severity. These findings provide novel insights into the host genetic determinants that influence peripheral immune cells in severe COVID-19.

Article activity feed

  1. SciScore for 10.1101/2022.02.06.21266924:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Single cell RNA-seq data on severe COVID-19: In this study, we downloaded four independent scRNA-seq datasets on COVID-19 in PBMC and BALF from the ArrayExpress database (Dataset #1, the accession number is E-MTAB-9357 from Su et al. study [10]), and the Gene Expression Omnibus (GEO) database (Dataset #2, the accession number is GSE149689 from Lee et al. study [
    ArrayExpress
    suggested: (ArrayExpress, RRID:SCR_002964)
    Gene Expression Omnibus
    suggested: (Gene Expression Omnibus (GEO), RRID:SCR_005012)
    Single cell RNA sequencing data processing: We performed normalization, clustering, and dimensionality reduction, differential expression gene (DEG) analysis, and visualization on these four independent scRNA-seq datasets with the Seurat R package [30].
    Seurat
    suggested: (SEURAT, RRID:SCR_007322)
    We used the qqman R package to draw both the Manhattan plot and the quantile-quantile (QQ) plot, and the web-based software LocusZoom (http://locuszoom.sph.umich.edu/) [34] to visualize the regional association plots for significant risk loci.
    LocusZoom
    suggested: (LOCUSZOOM, RRID:SCR_009257)
    Additionally, we leveraged the over-representation algorithm of the WebGestalt (http://www.webgestalt.org) [36] along with the significant genes as an input list to conduct a pathway enrichment analysis using the KEGG pathway resource [37].
    WebGestalt
    suggested: None
    http://www.webgestalt.org
    suggested: (WebGestalt: WEB-based GEne SeT AnaLysis Toolkit, RRID:SCR_006786)
    KEGG
    suggested: (KEGG, RRID:SCR_012773)
    In silico permutation analysis: To explore the concordance of results from both MAGMA analysis (Gene set #1: N = 944, P ≤ 0.05) and S-MultiXcan analysis (Gene set #2: N = 1,274, P ≤ 0.05), we performed an in silico permutation analysis that consisted of 100,000 (N Total) random selections [41, 42].
    MAGMA
    suggested: (MAGMA, RRID:SCR_005757)
    Drug-gene interaction analysis: We conducted a drug-gene interaction analysis for identified genetics-risk genes by using protein-chemical interactions in the context of STRING-based PPI networks [43] and STITCH-based drug annotation information (v5.0, http://stitch.embl.de/) [44].
    http://stitch.embl.de/
    suggested: (Search Tool for Interactions of Chemicals, RRID:SCR_007947)
    PLINK (v1.90) [46] was used to calculate the LD between SNPs within a 1 Mb window based on the 1000 Genomes Project European Phase 3 panel [33].
    PLINK
    suggested: (PLINK, RRID:SCR_001757)
    Cell-to-cell interaction analysis: To identify potential cellular interactions of CCR1+ CD16+ monocytes and CXCR6+ memory CD8+ T cells with other immune cells, we utilized the CellChat R package [47] for inferring the predicted cell-to-cell communications based on two normalized scRNA-seq datasets (dataset #1 of PBMC and dataset #4 of BALF).
    CellChat
    suggested: (CellChat, RRID:SCR_021946)
    The GTEx eQTL data (version 8) were downloaded from Zenodo repository (https://zenodo.org/record/3518299#.
    Zenodo
    suggested: (ZENODO, RRID:SCR_004129)
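
The in silico permutation analysis quoted in the table above compares the overlap between two gene sets with what random selection would produce. The sketch below shows one common way to set up such a test: draw random gene sets of the same size as the first set from a background list, count how often the overlap with the second set reaches the observed overlap, and report that fraction as an empirical P value. The function name `permutation_overlap_p`, the choice of background, the fixed seed, and the toy gene names are assumptions for illustration; the quoted sentence gives only the number of random selections (100,000).

```python
import random


def permutation_overlap_p(set_a, set_b, background, n_perm=100_000, seed=0):
    """Empirical P value for the overlap between two gene sets.

    Each permutation draws len(set_a) genes without replacement from
    `background` and counts how many fall in set_b. The P value is the
    fraction of permutations whose overlap is at least the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    set_a, set_b = set(set_a), set(set_b)
    observed = len(set_a & set_b)
    pool = sorted(background)  # rng.sample needs a sequence, not a set
    hits = 0
    for _ in range(n_perm):
        draw = rng.sample(pool, len(set_a))
        if len(set_b.intersection(draw)) >= observed:
            hits += 1
    return observed, hits / n_perm


# Toy example with hypothetical gene identifiers.
background = [f"g{i}" for i in range(1000)]
gene_set_1 = background[:50]     # stand-in for the first significant gene list
gene_set_2 = background[25:100]  # stand-in for the second significant gene list
observed, p_value = permutation_overlap_p(
    gene_set_1, gene_set_2, background, n_perm=2000
)
```

With 50 and 75 genes drawn from a background of 1,000, the expected chance overlap is about 4 genes, so an observed overlap of 25 is essentially never matched by a random draw. Some analysts report (hits + 1) / (n_perm + 1) instead, which avoids an empirical P value of exactly zero.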

    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    To reduce the influence of this limitation, we adopted a widely used approach by integrating large-scale GWAS summary statistics with an enormous amount of single-cell sequencing data, as referenced in previous studies [45, 78]. Based on our findings suggesting that host genetic components exert regulatory effects on immunological dysregulations for SARS-CoV-2 infection, more studies are warranted for exploring the genetic modification of peripheral T cells to defend against lethal severe COVID-19.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We found bar graphs of continuous data. We recommend replacing bar graphs with more informative graphics, as many different datasets can lead to the same bar graph. The actual data may suggest different conclusions from the summary statistics. For more information, please see Weissgerber et al (2015).
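
The Barzooka point that different datasets can produce the same bar graph can be made concrete with a small sketch. The two groups below are made-up values chosen so that their means are identical, which means bars showing the mean would have the same height even though the underlying distributions differ sharply.

```python
from statistics import mean, median

# Hypothetical measurements: tightly clustered versus driven by one outlier.
group_a = [4, 5, 5, 5, 6]
group_b = [1, 1, 2, 3, 18]

# What a bar graph of the mean would display for each group.
bar_heights = (mean(group_a), mean(group_b))

# What a dot plot or box plot would additionally reveal.
medians = (median(group_a), median(group_b))
ranges = (max(group_a) - min(group_a), max(group_b) - min(group_b))
```

Both bars would sit at 5, while the medians (5 versus 2) and ranges (2 versus 17) show that the second group is dominated by a single large value. Plotting the individual points makes that difference visible.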


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.