Comparative transcriptome analyses reveal genes associated with SARS-CoV-2 infection of human lung epithelial cells

Abstract

During 2020, understanding the molecular mechanism of SARS-CoV-2 infection (the cause of COVID-19) became a scientific priority due to the devastating effects of COVID-19. Many researchers have studied the effect of this viral infection on lung epithelial transcriptomes and deposited data in public repositories. Comprehensive analysis of such data could pave the way for development of efficient vaccines and effective drugs. In the current study, we obtained high-throughput gene expression data associated with human lung epithelial cells infected with respiratory viruses such as SARS-CoV-2, SARS-CoV, H1N1, avian influenza, rhinovirus and Dhori, then performed comparative transcriptome analysis to identify SARS-CoV-2-exclusive genes. The analysis yielded seven SARS-CoV-2-specific genes, including CSF2 [GM-CSF] (colony-stimulating factor 2) and calcium-binding proteins (such as S100A8 and S100A9), which are known to be involved in respiratory diseases. The analyses showed that genes involved in inflammation are commonly altered by infection with SARS-CoV-2 and influenza viruses. Furthermore, results of protein–protein interaction analyses were consistent with a functional role of CSF2 and S100A9 in COVID-19 disease. In conclusion, our analysis revealed cellular genes associated with SARS-CoV-2 infection of the human lung epithelium; these are potential therapeutic targets.
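The idea of "SARS-CoV-2 exclusive" genes can be illustrated as a set difference: take the genes altered by SARS-CoV-2 and drop any that are also altered by another virus. The following is a minimal sketch under that assumption, not the authors' code. The function name `exclusive_genes` and the `GENE_*` placeholders are hypothetical, and the toy gene sets are illustrative inputs rather than results from the study.

```python
def exclusive_genes(degs_by_virus, target):
    """Return genes altered by `target` and by no other virus in the mapping."""
    # Pool the differentially expressed genes of every virus except the target.
    others = set().union(
        *(genes for virus, genes in degs_by_virus.items() if virus != target)
    )
    # Keep only the target's genes that never appear in the pooled set.
    return set(degs_by_virus[target]) - others


# Toy input: GENE_* names are placeholders, not genes reported by the study.
degs = {
    "SARS-CoV-2": {"CSF2", "S100A8", "S100A9", "GENE_SHARED"},
    "H1N1": {"GENE_SHARED", "GENE_Y"},
    "SARS-CoV": {"GENE_Z"},
}

print(sorted(exclusive_genes(degs, "SARS-CoV-2")))
# → ['CSF2', 'S100A8', 'S100A9']
```

In practice each set would come from a per-dataset differential expression result filtered by whatever fold-change and significance thresholds the analysis uses; those thresholds are not stated in the text above.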

Article activity feed

  1. SciScore for 10.1101/2020.06.24.169268

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Experimental Models: Cell Lines
    Sentences    Resources
    GSE71766 comprised human bronchial epithelial cells (BEAS-2B) infected with rhinovirus (RV), influenza virus (H1N1), or both (RV + H1N1) (16).
    BEAS-2B
    suggested: None
    Bronchial epithelial cell line 2B4 (a clonal derivative of Calu-3 cells) infected with SARS-CoV or Dhori virus (DOHV) were part of GSE17400 (17).
    Calu-3
    suggested: RRID:CVCL_YZ47
    Software and Algorithms
    Sentences    Resources
    Raw sequencing data related to selected samples of GSE147507 as fastq files were downloaded from Sequence Read Archive (SRA) using fastq-dump of sratoolkit v2.9.6 [http://ncbi.github.io/sra-tools/].
    Sequence Read Archive
    suggested: (DDBJ Sequence Read Archive, RRID:SCR_001370)
    First, raw sequencing reads were trimmed to remove adapter sequences and low-quality regions using Trim Galore!
    Trim Galore
    suggested: (Trim Galore, RRID:SCR_011847)
    Trimmed reads were subjected to quality control analysis using FastQC [https://www.bioinformatics.babraham.ac.uk/projects/fastqc/].
    FastQC
    suggested: (FastQC, RRID:SCR_014583)
    Tophat v2.1 was used to map trimmed raw reads to the human reference genome (hg38) (10).
    Tophat
    suggested: (TopHat, RRID:SCR_013035)
    All bam files from multiple runs related to the same samples were merged and sorted using SAMtools
    SAMtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Finally, raw read counts were enumerated for each gene in each sample using HTSeq-count (12).
    HTSeq-count
    suggested: (htseq-count, RRID:SCR_011867)
    Analysis of differential expression was performed using DESeq2 according to a standard protocol [https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html] (13)
    DESeq2
    suggested: (DESeq, RRID:SCR_000154)
    Gene ontology enrichment analyses of the Differentially Expressed Genes (DEGs) were accomplished by use of the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 online tool (14)
    DAVID
    suggested: (DAVID, RRID:SCR_001881)
    Microarray data collection and analysis: The NCBI GEO database was queried for microarray data related to SARS-CoV infections of human lung epithelial cells.
    NCBI GEO
    suggested: None
    GEO2R was used to identify differentially expressed genes for each of these studies independently (9).
    GEO2R
    suggested: (GEO2R, RRID:SCR_016569)
    Protein-protein interaction analysis: STRING, a database of known or predicted protein-protein interactions (PPIs) was used to obtain interactions between genes altered on SARS-CoV-2 infection (20).
    STRING
    suggested: (STRING, RRID:SCR_005223)
    Output from the STRING database was uploaded to Cytoscape v3.7.2 in simple interaction format, and the Cytohubba app was employed to identify hub genes (21-23).
    Cytoscape
    suggested: (Cytoscape, RRID:SCR_003032)
    Cytohubba
    suggested: (cytoHubba, RRID:SCR_017677)
    We also checked the DrugBank database to determine if a drug is available to target them (24).
    DrugBank
    suggested: (DrugBank, RRID:SCR_002700)
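The hub-gene step quoted above (STRING output loaded into Cytoscape in simple interaction format, then ranked with cytoHubba) can be sketched outside Cytoscape. cytoHubba offers several ranking methods and the quoted sentence does not say which one was used, so the sketch below assumes the simplest, node degree. The functions `parse_sif` and `rank_by_degree` and the `GENE_*` network are hypothetical, and the parser assumes node names contain no spaces.

```python
from collections import defaultdict


def parse_sif(text):
    """Parse simple interaction format lines: source, relation, target(s)."""
    edges = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line or isolated node: no edge to record
        source, targets = fields[0], fields[2:]
        for target in targets:
            if source != target:  # ignore self-loops
                edges.add(frozenset((source, target)))  # undirected edge
    return edges


def rank_by_degree(edges):
    """Return (node, degree) pairs, highest degree first, ties by name."""
    neighbours = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        neighbours[a].add(b)
        neighbours[b].add(a)
    degrees = ((node, len(adj)) for node, adj in neighbours.items())
    return sorted(degrees, key=lambda item: (-item[1], item[0]))


# Toy network with placeholder names; not interactions from the study.
sif_text = """\
GENE_A pp GENE_B GENE_C
GENE_B pp GENE_C
GENE_D pp GENE_A
"""

for node, degree in rank_by_degree(parse_sif(sif_text)):
    print(node, degree)
# → GENE_A 3
# → GENE_B 2
# → GENE_C 2
# → GENE_D 1
```

Degree is only one possible definition of a hub; other rankings can order the same network differently.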

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.