Identifying The “Core” Transcriptome of SARS-CoV-2 Infected Cells

Abstract

In 2019, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) first emerged, causing the COVID-19 pandemic. Consequently, ongoing research has focused on better understanding the mechanisms underlying the symptoms of this disease. Although COVID-19 symptoms span a range of organ systems, the specific changes in gene regulation that lead to the variety of symptoms are still unclear. In our study, we used publicly available transcriptome data from previous studies on SARS-CoV-2 to identify commonly regulated genes across cardiomyocytes, human bronchial epithelial cells, alveolar type II cells, lung adenocarcinoma, human embryonic kidney cells, and patient samples. Additionally, using this common “core” transcriptome, we could identify the genes that were specifically and uniquely regulated in bronchial epithelial cells, embryonic kidney cells, or cardiomyocytes. For example, we found that genes related to cell metabolism were uniquely upregulated in kidney cells, providing a first mechanistic clue as to how kidney cells specifically may be affected by SARS-CoV-2. Overall, our results uncover connections between the differential gene regulation in various cell types in response to SARS-CoV-2 infection and help identify targets of potential therapeutics.

Article activity feed

  1. SciScore for 10.1101/2021.09.22.461142:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Experimental Models: Cell Lines
    Sentences | Resources
    The SRRs selected for this study (Table S1) were from patient lung samples and eight cell lines: human embryonic kidney 293 cells with SV40 large T antigen (HEK 293T), human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs), human induced pluripotent stem cell-derived alveolar type II epithelial-like cells (iAT2s), iPSCs, Vero E6, primary human bronchial epithelial cells (NHBE), human alveolar epithelial cells (lung adenocarcinoma) located on basal side (A549), and human bronchial epithelial cells (lung adenocarcinoma) located on apical side (Calu-3).
    human embryonic kidney 293
    suggested: None
    Vero E6
    suggested: None
    Determining biological significance of the “core” transcriptome using g:Profiler and R code to generate PCA plots and clusters of GO terms: Given the too small sample size, the uniquely downregulated genes from the HEK 293T cell line analysis as well as both the upregulated and downregulated genes from the NHBE cell line analysis were examined individually in order to understand the likely biological role of these genes in viral infection.
    HEK 293T
    suggested: (CCLV Cat# CCLV-RIE 1018, RRID:CVCL_0063)
    Software and Algorithms
    Sentences | Resources
    Choosing BioProjects: Human transcriptomic data for uninfected and SARS-CoV-2 infected samples were obtained from the NCBI BioProject database.
    NCBI BioProject
    suggested: (NCBI BioProject, RRID:SCR_004801)
    A BioProject was chosen for the study if there were at least two replicates (at least two different SRR numbers) for both uninfected and SARS-CoV-2 infected samples.
    BioProject
    suggested: (NCBI BioProject, RRID:SCR_004801)
    The key steps in the Galaxy workflow are FASTQC, Trimmomatic, HISAT2, and FeatureCounts.
    FeatureCounts
    suggested: (featureCounts, RRID:SCR_012919)
    First, FASTQC is run on the datasets to check for the sequence files’ qualities.
    FASTQC
    suggested: (FastQC, RRID:SCR_014583)
    Second, the Trimmomatic tool is used to remove Illumina adapter sequences from the reads, to trim the low-quality sequences from either end of the reads, and to remove any sequences with a less than 25 nt length.
    Trimmomatic
    suggested: (Trimmomatic, RRID:SCR_011848)
    Third, HISAT2 gives the overall alignment rate, thereby allowing the user to know how much of each sequence file maps back to the human genome.
    HISAT2
    suggested: (HISAT2, RRID:SCR_015530)
    Choosing SRRs: All SRRs present in a particular BioProject for both uninfected and SARS-CoV-2 infected samples were selected for an initial run-through of the experimental workflow in Galaxy.
    Galaxy
    suggested: (Galaxy, RRID:SCR_006281)
    In each DESeq2 analysis, the counts tables (generated from the FeatureCounts step) of the replicates of a cell line were compared based on one factor, “Infection”, with two levels: “Uninfected” and “Infected”.
    DESeq2
    suggested: (DESeq, RRID:SCR_000154)
    In order to screen for differentially regulated genes that are unique to each cell line, the upregulated and downregulated genes from each of the three selected cell lines were compared with the corresponding upregulated and downregulated genes from the combined analysis using Microsoft Excel.
    Microsoft Excel
    suggested: (Microsoft Excel, RRID:SCR_016137)
    Instead, a gene ontology (GO) analysis was performed on those sets of results by inputting each list of genes into g:Profiler, a web server for functional enrichment analysis, to obtain gene ontology (GO) terms from each of the three sub-ontologies: Biological Process (BP), Cellular Component (CC), and Molecular Function (MF) [21].
    g:Profiler
    suggested: (G:Profiler, RRID:SCR_006809)
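
Two of the quoted method sentences describe simple selection rules: a BioProject qualifies only if it has at least two distinct SRRs for both uninfected and infected samples, and cell-line-specific genes are found by comparing each cell line's gene list against the combined analysis. The following is a minimal Python sketch of those two rules, not the authors' workflow (the manuscript reports using Galaxy and Microsoft Excel). The function names, accession numbers, and gene symbols are hypothetical, and treating "unique" as a plain set difference against the combined list is an assumption.

```python
from typing import Dict, Iterable, Set


def has_enough_replicates(runs_by_condition: Dict[str, Iterable[str]],
                          minimum: int = 2) -> bool:
    """Return True if both conditions have at least `minimum` distinct SRRs."""
    return all(
        len(set(runs_by_condition.get(condition, ()))) >= minimum
        for condition in ("Uninfected", "Infected")
    )


def unique_to_cell_line(cell_line_genes: Iterable[str],
                        combined_genes: Iterable[str]) -> Set[str]:
    """Genes regulated in one cell line but absent from the combined list.

    Assumption: "unique" means a set difference against the combined analysis.
    """
    return set(cell_line_genes) - set(combined_genes)


# Hypothetical accession numbers and gene lists, for illustration only.
project = {
    "Uninfected": ["SRR0000001", "SRR0000002"],
    "Infected": ["SRR0000003"],  # only one infected replicate
}
eligible = has_enough_replicates(project)

hek_up = ["GENE_A", "GENE_B", "GENE_C"]   # upregulated in one cell line
core_up = ["GENE_B", "GENE_D"]            # upregulated in the combined analysis
hek_unique_up = unique_to_cell_line(hek_up, core_up)

print(eligible, sorted(hek_unique_up))
```

In this sketch the example project is rejected because it has a single infected run, and the same comparison would be repeated separately for upregulated and downregulated lists in each cell line.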

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.