Meta-analysis of virus-induced host gene expression reveals unique signatures of immune dysregulation induced by SARS-CoV-2


Abstract

The clinical outcome of COVID-19 shows an extreme bias by age, genetics and comorbidity that is thought to be driven by an impaired immune response to SARS-CoV-2, the causative agent of the disease. The unprecedented impact of COVID-19 on global health has led multiple studies to generate extensive gene expression datasets in a relatively short period of time. To better understand the immune dysregulation induced by SARS-CoV-2, we carried out a meta-analysis of the transcriptomics data available in the published literature. Datasets included both those from SARS-CoV-2-infected cell lines in vitro and those from patient samples. We focused our analysis on identifying virally perturbed host functions as captured by co-expressed gene module analysis. Transcriptomics data from lung biopsies and nasopharyngeal samples, as opposed to those available from other clinical samples and infected cell lines, provided key signatures of the role of the host's immune response in COVID-19 pathogenesis. For example, severity of infection and patients' age are linked to the absence of stimulation of the RIG-I-like receptor signaling pathway, a known critical immediate line of defense against RNA viral infections that triggers type-I interferon responses. In addition, co-expression analysis of age-stratified transcriptional data provided evidence that signatures of key immune response pathways are perturbed in older COVID-19 patients. In particular, dysregulation of antigen-presenting components, down-regulation of cell cycle mechanisms and signatures of hyper-enriched monocytes were strongly correlated with age in older individuals infected with SARS-CoV-2. Collectively, our meta-analysis highlights the ability of transcriptomics and gene-module analysis of aggregated datasets to improve our understanding of the host-specific disease mechanisms underpinning COVID-19.

Article activity feed

  1. SciScore for 10.1101/2020.12.29.424739:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.
    Cell Line Authentication: not detected.

    Table 2: Resources

    Experimental Models: Cell Lines
    Sentences | Resources
    SARS-CoV-2 cell line and lung biopsy data (Blanco-Melo et al. 2020; Wyler et al. 2020), SARS-CoV-1 (Josset et al. 2013; Sims et al. 2013; Frieman et al. 2014) and MERS-CoV (Josset et al. 2013; Frieman et al. 2013) were downloaded from NCBI GEO using the following accession numbers: GSE147507, GSE45042, GSE33267, GSE56192, GSE148729.
    SARS-CoV-2
    suggested: None
    Software and Algorithms
    Sentences | Resources
    The SARS-CoV-2 PBMC-BALF transcriptome patient dataset (Xiong et al. 2020) was downloaded from the Genome Sequence Archive (https://bigd.big.ac.cn/) using the accession number CRA002390.
    Genome Sequence Archive
    suggested: None
    The paired end reads were mapped onto the human hg38 genome using the STAR aligner (Dobin et al. 2013).
    STAR
    suggested: (STAR, RRID:SCR_015899)
    The resulting mapped reads were quantified using the featureCounts program (Liao et al. 2014) in the Subread R package (Liao et al. 2019).
    featureCounts
    suggested: (featureCounts, RRID:SCR_012919)
    As an additional preprocessing step, RNA-seq counts were scaled and normalized by the TMM (trimmed mean of M-values) method of edgeR package (Robinson et al. 2010) and log transformed using voom (Law et al. 2014), followed by differential expression analysis using the limma workflow.
    edgeR
    suggested: (edgeR, RRID:SCR_012802)
    The differential gene lists containing gene identifiers, log transformed fold change and their corresponding false discovery rates (FDR) were analysed with the Ingenuity Pathway Analysis (IPA) software (https://www.qiagenbioinformatics.com/products/ingenuity-pathway-analysis) and differentially regulated canonical pathways identified.
    Ingenuity Pathway Analysis
    suggested: (Ingenuity Pathway Analysis, RRID:SCR_008653)
    We used the voom function in the limma R package to correct for batch and gender and retained the effect of viral load to stratify data.
    limma
    suggested: (LIMMA, RRID:SCR_010943)
    To determine the functional roles of the constructed WGCNA modules, the Fisher exact test was used, implemented in the GeneOverlap package (Shen 2020), using BTMs as reference modules and enriched modules across age groups were compared.
    GeneOverlap
    suggested: (GeneOverlap, RRID:SCR_018419)
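
The module-enrichment step quoted above (a Fisher exact test of WGCNA modules against BTM reference modules) can be illustrated with a minimal sketch. This is not the authors' code or the GeneOverlap implementation: it assumes a one-sided (over-representation) test, which is equivalent to the upper tail of the hypergeometric distribution, and the function name `overlap_pvalue`, the gene labels and the background size are all hypothetical placeholders.

```python
from math import comb


def overlap_pvalue(module, reference, genome_size):
    """One-sided Fisher exact test for the overlap of two gene sets.

    Returns (overlap size, p-value). The p-value is P(X >= k), where X is
    hypergeometric: drawing len(module) genes from a background of
    genome_size genes, of which len(reference) belong to the reference set.
    """
    module, reference = set(module), set(reference)
    k = len(module & reference)           # observed overlap
    m, n = len(module), len(reference)    # sizes of the two gene sets
    total = comb(genome_size, m)          # all ways to draw the module
    # Sum the probabilities of overlaps at least as large as the observed one.
    tail = sum(
        comb(n, i) * comb(genome_size - n, m - i)
        for i in range(k, min(m, n) + 1)
    )
    return k, tail / total


if __name__ == "__main__":
    # Toy data only: placeholder gene labels, not genes from the study.
    wgcna_module = ["G1", "G2", "G3", "G4", "G5"]
    btm_module = ["G4", "G5", "G6", "G7", "G8", "G9", "G10"]
    k, p = overlap_pvalue(wgcna_module, btm_module, genome_size=20)
    print(f"overlap={k}, p={p:.4f}")
```

In a real comparison across age groups, a test like this would be run for every module pair and the resulting p-values corrected for multiple testing; the choice of background (`genome_size`) strongly affects the result and is an assumption the analyst must state.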

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on page 43. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.