Comparative transcriptome analyses reveal genes associated with SARS-CoV-2 infection of human lung epithelial cells

Abstract

Understanding the molecular mechanism of SARS-CoV-2 infection (the cause of COVID-19) is a scientific priority for 2020. Various research groups are working toward development of vaccines and drugs, and many have published genomic and transcriptomic data related to this viral infection. The power inherent in publicly available data can be demonstrated via comparative transcriptome analyses. In the current study, we collected high-throughput gene expression data from human lung epithelial cells infected with SARS-CoV-2 or other respiratory viruses (SARS, H1N1, rhinovirus, avian influenza, and Dhori) and compared the effects of these viruses on the human transcriptome. The analyses identified fifteen genes specifically expressed in cells infected with SARS-CoV-2; these included CSF2 (colony-stimulating factor 2) and S100A8 and S100A9 (calcium-binding proteins), all of which are involved in lung/respiratory disorders. The analyses also showed that genes involved in the type I interferon signaling pathway and in apoptosis are commonly altered by infection with SARS-CoV-2 and with influenza viruses. Furthermore, results of protein-protein interaction analyses were consistent with a functional role of CSF2 in COVID-19 disease. In conclusion, our analysis has revealed cellular genes associated with SARS-CoV-2 infection of the human lung epithelium; these are potential therapeutic targets.
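
The comparison described in the abstract (genes altered only by one virus versus genes altered by several) can be read as set operations over per-virus lists of differentially expressed genes. The following is a minimal sketch under that assumption; the function names and the toy gene lists are hypothetical placeholders, not the authors' code or results.

```python
from typing import Dict, Iterable, Set


def virus_specific_genes(degs: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return genes altered under `target` and under no other virus."""
    # Union of the gene sets of every virus except the target.
    others: Set[str] = set()
    for virus, genes in degs.items():
        if virus != target:
            others |= genes
    return degs[target] - others


def shared_genes(degs: Dict[str, Set[str]], viruses: Iterable[str]) -> Set[str]:
    """Return genes altered under every virus in `viruses`."""
    gene_sets = [degs[v] for v in viruses]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy input: GENE_X and GENE_Y are made-up placeholders, not study data.
toy_degs = {
    "SARS-CoV-2": {"CSF2", "S100A8", "GENE_X"},
    "H1N1": {"GENE_X", "GENE_Y"},
    "SARS-CoV": {"GENE_Y"},
}

specific = virus_specific_genes(toy_degs, "SARS-CoV-2")
common = shared_genes(toy_degs, ["SARS-CoV-2", "H1N1"])
```

With the toy input, `specific` holds the two genes seen only under SARS-CoV-2 and `common` holds the one gene shared with H1N1. A real analysis would first have to harmonize gene identifiers across the RNA-seq and microarray platforms.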

Article activity feed

  1. SciScore for 10.1101/2020.06.24.169268:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Experimental Models: Cell Lines
    Sentences | Resources
    GSE71766 comprised human bronchial epithelial cells (BEAS-2B) infected with rhinovirus (RV), influenza virus (H1N1), or both (RV + H1N1) (16).
    BEAS-2B
    suggested: None
    Bronchial epithelial 2B4 cells (a clonal derivative of Calu-3 cells) infected with SARS-CoV or Dhori virus (DOHV) were part of GSE17400 (17).
    Calu-3
    suggested: (RRID:CVCL_YZ47)
    Software and Algorithms
    Sentences | Resources
    Raw sequencing data related to selected samples of GSE147507 as fastq files were downloaded from Sequence Read Archive (SRA) using fastq-dump of sratoolkit v2.9.6 [http://ncbi.github.io/sra-tools/].
    Sequence Read Archive
    suggested: (DDBJ Sequence Read Archive, RRID:SCR_001370)
    First, raw sequencing reads were trimmed to remove adapter sequences and low-quality regions using Trim Galore!
    Trim Galore
    suggested: (Trim Galore, RRID:SCR_011847)
    Trimmed reads were subjected to quality control analysis using FastQC [https://www.bioinformatics.babraham.ac.uk/projects/fastqc/].
    FastQC
    suggested: (FastQC, RRID:SCR_014583)
    Tophat v2.1 was used to map trimmed raw reads to the human reference genome (hg38) (10).
    Tophat
    suggested: (TopHat, RRID:SCR_013035)
    All bam files from multiple runs related to the same samples were merged and sorted using SAMtools
    SAMtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Finally, raw read counts were enumerated for each gene in each sample using HTSeq-count (12).
    HTSeq-count
    suggested: (htseq-count, RRID:SCR_011867)
    Analysis of differential expression was performed using DESeq2 according to a standard protocol [https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html] (13)
    DESeq2
    suggested: (DESeq, RRID:SCR_000154)
    Gene ontology enrichment analyses of the Differentially Expressed Genes (DEGs) were accomplished by use of the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 online tool (14)
    DAVID
    suggested: (DAVID, RRID:SCR_001881)
    Microarray data collection and analysis: The NCBI GEO database was queried for microarray data related to SARS-CoV infections of human lung epithelial cells.
    NCBI GEO
    suggested: None
    GEO2R was used to identify differentially expressed genes for each of these studies independently (9).
    GEO2R
    suggested: (GEO2R, RRID:SCR_016569)
    Protein-protein interaction analysis: STRING, a database of known or predicted protein-protein interactions (PPIs) was used to obtain interactions between genes altered on SARS-CoV-2 infection (20).
    STRING
    suggested: (STRING, RRID:SCR_005223)
    Output from the STRING database was uploaded to Cytoscape v3.7.2 in simple interaction format, and the Cytohubba app was employed to identify hub genes (21-23).
    Cytoscape
    suggested: (Cytoscape, RRID:SCR_003032)
    Cytohubba
    suggested: (cytoHubba, RRID:SCR_017677)
    We also checked the DrugBank database to determine if a drug is available to target them (24).
    DrugBank
    suggested: (DrugBank, RRID:SCR_002700)
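
The RNA-seq steps quoted in the table above (download, trimming, quality control, alignment, merging and sorting, counting) can be laid out as an ordered list of command lines. The sketch below only builds the argument lists and does not run anything. The function name, directory layout, index name, and annotation path are hypothetical, single-end reads are assumed, and the options shown are a plausible choice rather than the settings the authors used.

```python
from pathlib import Path
from typing import List


def build_pipeline(sample: str, runs: List[str], bowtie_index: str,
                   gtf: str, workdir: str = "work") -> List[List[str]]:
    """Build (but do not execute) per-sample commands, one argv list each."""
    wd = Path(workdir)
    commands: List[List[str]] = []
    run_bams: List[str] = []
    for run in runs:
        fastq = wd / f"{run}.fastq.gz"           # written by fastq-dump --gzip
        trimmed = wd / f"{run}_trimmed.fq.gz"    # Trim Galore single-end naming
        tophat_dir = wd / f"tophat_{run}"
        # 1. Download raw reads from SRA.
        commands.append(["fastq-dump", "--gzip", "--outdir", str(wd), run])
        # 2. Remove adapters and low-quality regions.
        commands.append(["trim_galore", "--output_dir", str(wd), str(fastq)])
        # 3. Quality control on the trimmed reads.
        commands.append(["fastqc", "--outdir", str(wd), str(trimmed)])
        # 4. Align to the reference (index prefix is an assumption).
        commands.append(["tophat2", "-o", str(tophat_dir), "-G", gtf,
                         bowtie_index, str(trimmed)])
        run_bams.append(str(tophat_dir / "accepted_hits.bam"))
    merged = wd / f"{sample}.merged.bam"
    sorted_bam = wd / f"{sample}.sorted.bam"
    # 5. Merge all runs of the sample, then sort by coordinate.
    commands.append(["samtools", "merge", str(merged), *run_bams])
    commands.append(["samtools", "sort", "-o", str(sorted_bam), str(merged)])
    # 6. Count reads per gene; "-r pos" matches the coordinate-sorted BAM.
    commands.append(["htseq-count", "-f", "bam", "-r", "pos",
                     str(sorted_bam), gtf])
    return commands


# Placeholder run identifiers and file names, for illustration only.
pipeline = build_pipeline("sampleA", ["RUN1", "RUN2"], "hg38_index", "genes.gtf")
```

Each inner list could be handed to `subprocess.run(cmd, check=True)`; the output of `htseq-count` goes to standard output and would need to be redirected to a file. Differential expression with DESeq2 is an R step and is not covered here.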

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
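
As an illustration of what checking for the presence of RRIDs can look like, the sketch below pulls identifiers of the form seen in the table above (for example `RRID:SCR_014583`) out of free text. The pattern is a deliberate simplification chosen for this example, and the names `RRID_PATTERN` and `extract_rrids` are hypothetical; it is not a description of how SciScore itself works, and it does not verify that an identifier resolves to a real resource.

```python
import re
from typing import List

# Simplified shape: "RRID:" + letters + "_" + letters/digits.
# Real RRIDs come in more varieties than this pattern covers.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9]+)")


def extract_rrids(text: str) -> List[str]:
    """Return every RRID-like identifier found in `text`, in order."""
    return RRID_PATTERN.findall(text)


found = extract_rrids("suggested: (FastQC, RRID:SCR_014583)")
missing = extract_rrids("suggested: None")
```

Here `found` contains the single identifier from the FastQC row, and `missing` is empty because that row carries no RRID.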