Long-Read RNA Sequencing Identifies Polyadenylation Elongation and Differential Transcript Usage of Host Transcripts During SARS-CoV-2 In Vitro Infection
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Better methods to interrogate host-pathogen interactions during Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infections are imperative to help understand and prevent this disease. Here we implemented RNA-sequencing (RNA-seq) using Oxford Nanopore Technologies (ONT) long-reads to measure differential host gene expression, transcript polyadenylation and isoform usage within various epithelial cell lines permissive and non-permissive for SARS-CoV-2 infection. SARS-CoV-2-infected and mock-infected Vero (African green monkey kidney epithelial cells), Calu-3 (human lung adenocarcinoma epithelial cells), Caco-2 (human colorectal adenocarcinoma epithelial cells) and A549 (human lung carcinoma epithelial cells) were analyzed over time (0, 2, 24, 48 hours). Differential polyadenylation was found to occur in both infected Calu-3 and Vero cells during a late time point (48 hpi), with Gene Ontology (GO) terms such as viral transcription and translation shown to be significantly enriched in Calu-3 data. Poly(A) tails showed increased lengths in the majority of the differentially polyadenylated transcripts in Calu-3 and Vero cell lines (up to ~101 nt in mean poly(A) length, padj = 0.029). Of these genes, ribosomal protein genes such as RPS4X and RPS6 also showed downregulation in expression levels, suggesting the importance of ribosomal protein genes during infection. Furthermore, differential transcript usage was identified in Caco-2, Calu-3 and Vero cells, including transcripts of genes such as GSDMB and KPNA2, which have previously been implicated in SARS-CoV-2 infections. Overall, these results highlight the potential role of differential polyadenylation and transcript usage in host immune response or viral manipulation of host mechanisms during infection, and therefore, showcase the value of long-read sequencing in identifying less-explored host responses to disease.
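The abstract reports changes in mean poly(A) length between mock-infected and infected cells. As a rough illustration of that comparison, the sketch below averages per-read poly(A) estimates per transcript and takes the difference between conditions. The function names, the input layout (pairs of transcript ID and poly(A) length in nt) and the toy values are assumptions made for this example and do not come from the study. The sketch does not attempt the significance testing behind the reported adjusted p-values.

```python
from collections import defaultdict
from statistics import mean


def mean_polya_by_transcript(reads):
    """Average per-read poly(A) lengths for each transcript.

    `reads` is an iterable of (transcript_id, polya_length_nt) pairs;
    this input layout is a hypothetical one chosen for the example.
    """
    lengths = defaultdict(list)
    for transcript_id, polya_length in reads:
        lengths[transcript_id].append(polya_length)
    return {tid: mean(vals) for tid, vals in lengths.items()}


def polya_shift(control_reads, infected_reads):
    """Return infected-minus-control mean poly(A) length per transcript.

    Only transcripts observed in both conditions are reported.
    """
    control = mean_polya_by_transcript(control_reads)
    infected = mean_polya_by_transcript(infected_reads)
    shared = control.keys() & infected.keys()
    return {tid: infected[tid] - control[tid] for tid in sorted(shared)}


# Toy values for illustration only; these are not data from the study.
control = [("tx1", 60.0), ("tx1", 70.0), ("tx2", 50.0)]
infected = [("tx1", 90.0), ("tx1", 100.0), ("tx2", 50.0), ("tx3", 80.0)]
shifts = polya_shift(control, infected)
print(shifts)
```

A positive value indicates a longer mean tail in the infected sample. A real analysis would also need read-level quality filtering and a statistical test with multiple-testing correction.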
Article activity feed
SciScore for 10.1101/2021.12.14.472725:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Experimental Models: Cell Lines

Sentence: Additional datasets were generated for this study including PCR cDNA datasets for cell lines (Vero, Caco-2, Calu-3 and A549) and the direct RNA and direct cDNA datasets for A549.
- Resource: Vero, suggested: None
- Resource: Calu-3, suggested: None

Sentence: For this current study, we additionally cultured A549 (human lung carcinoma epithelial – ATCC CCL-185) cells to supplement our main data, using similar methods.
- Resource: A549, suggested: None

Sentence: Briefly, RNA from mock control and infected cells harvested at 0, 2, 24 and 48 hpi from Caco-2, Calu-3 and Vero cells was sequenced with the ONT Direct cDNA Sequencing Kit (SQK-DCS109) in conjunction with the Native Barcoding Kit (EXP-NBD104)
- Resource: Caco-2, suggested: None

Software and Algorithms

Sentence: availability: ONT sequencing data (direct RNA and direct cDNA) for this study from cell lines (Vero, Caco-2 and Calu-3) was derived from our previous work (Chang et al., 2021), and is currently publicly available at NCBI repository BioProject PRJNA675370
- Resource: BioProject, suggested: (NCBI BioProject, RRID:SCR_004801)

Sentence: The results of individual analyses are available at Figtree DOI: 10.6084/m9.figshare.17139995 (differential expression), 10.6084/m9.figshare.16841794 (differential polyadenylation) and 10.6084/m9.figshare.17140007 (differential transcript usage).
- Resource: Figtree, suggested: (FigTree, RRID:SCR_008515)

Sentence: All direct RNA and direct cDNA libraries were loaded onto a R9.4.1 MinION flow cell and sequenced for 72 hrs using an ONT MinION or GridION.
- Resource: MinION, suggested: (MinION, RRID:SCR_017985)

Sentence: All resulting FASTQ data were mapped using Minimap2 v2.17 (Heng Li, 2018)
- Resource: Minimap2, suggested: (Minimap2, RRID:SCR_018550)

Sentence: Direct RNA-seq data was mapped to the combined genome (consisting of human/African green monkey genome from Ensembl (release 100), SARS-CoV-2 Australia virus (Australia/VIC01/2020, NCBI:MT007544.1) and the RNA sequin decoy chromosome genome (Hardwick et al., 2016) with the default direct RNA parameters ‘-ax splice -uf -k14 --secondary=no’ and for all cDNA datasets ‘-ax splice --secondary=no’.
- Resource: Ensembl, suggested: (Ensembl, RRID:SCR_002344)

Sentence: The resulting BAM files were sorted and indexed using Samtools v1.9 (H.
- Resource: Samtools, suggested: (SAMTOOLS, RRID:SCR_002105)

Sentence: Counts files were generated using Featurecounts v2.0.0 (Liao, Smyth, & Shi, 2014) for genome-mapped cDNA data, and with Salmon v0.13.1 (Patro
- Resource: Featurecounts, suggested: (featureCounts, RRID:SCR_012919)
- Resource: Salmon, suggested: (Salmon, RRID:SCR_017036)

Sentence: Differential expression analysis: DESeq2 was used to identify differentially expressed genes/transcripts from direct cDNA data.
- Resource: DESeq2, suggested: (DESeq, RRID:SCR_000154)

Sentence: For the nanopolish analysis, all Caco-2, Calu-3 and Vero direct RNA BAM files mapped to the combined reference genome (host, sequin, virus) were indexed with the nanopolish v0.13.2 ‘index’ function with the command ‘nanopolish index -d $FAST5 -s $SEQUENCING_SUMMARY $FASTQ’.
- Resource: nanopolish, suggested: (Nanopolish, RRID:SCR_016157)

Sentence: Raincloud plots were generated for median poly(T) lengths of each gene with increased poly(A) length in the Calu-3 48 hpi dataset in both conditions (control and infected) using ggplot2 v3.3.4 (Wickham, 2016) to replicate the raincloud plots generated by the raincloudplots package in R (Allen et al., 2021).
- Resource: ggplot2, suggested: (ggplot2, RRID:SCR_014601)

Sentence: GO and KEGG pathway analysis: Significant biological GO biological terms and KEGG pathways were identified with genes that were found to be significantly differentially expressed and polyadenylated in the analyses above.
- Resource: KEGG, suggested: (KEGG, RRID:SCR_012773)

Results from OddPub: Thank you for sharing your data.
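The sentences quoted in Table 2 give the Minimap2 option strings for direct RNA and cDNA libraries, along with the `nanopolish index` command. The sketch below shows one way those invocations could be assembled as argument lists, for example to pass to `subprocess.run`. The function names, dictionary keys and file names are hypothetical. The cDNA option is written as `--secondary=no` on the assumption that the single dash in the quoted text is a typographic conversion. Nothing here runs the tools themselves.

```python
import shlex

# Option strings as quoted in the Resources table; the keys are
# hypothetical labels chosen for this example.
MINIMAP2_OPTIONS = {
    "direct_rna": ["-ax", "splice", "-uf", "-k14", "--secondary=no"],
    "cdna": ["-ax", "splice", "--secondary=no"],
}


def minimap2_command(library_type, reference, fastq):
    """Build the minimap2 argument list for one library."""
    if library_type not in MINIMAP2_OPTIONS:
        raise ValueError(f"unknown library type: {library_type}")
    return ["minimap2", *MINIMAP2_OPTIONS[library_type], reference, fastq]


def nanopolish_index_command(fast5_dir, sequencing_summary, fastq):
    """Build the `nanopolish index` argument list quoted in the report."""
    return ["nanopolish", "index", "-d", fast5_dir,
            "-s", sequencing_summary, fastq]


# File names below are placeholders, not paths from the study.
rna_cmd = minimap2_command("direct_rna", "combined_reference.fa",
                           "calu3_48hpi.fastq")
index_cmd = nanopolish_index_command("fast5/", "sequencing_summary.txt",
                                     "calu3_48hpi.fastq")

# Print shell-quoted command lines for inspection.
print(shlex.join(rna_cmd))
print(shlex.join(index_cmd))
```

Keeping the options in a lookup table makes the difference between the two library types explicit: the direct RNA settings add `-uf` and `-k14` to the spliced-alignment options shared with cDNA.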
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.