Long-Read RNA Sequencing Identifies Polyadenylation Elongation and Differential Transcript Usage of Host Transcripts During SARS-CoV-2 In Vitro Infection

Abstract

Better methods to interrogate host-pathogen interactions during Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infections are imperative to help understand and prevent this disease. Here we implemented RNA-sequencing (RNA-seq) using Oxford Nanopore Technologies (ONT) long-reads to measure differential host gene expression, transcript polyadenylation and isoform usage within various epithelial cell lines permissive and non-permissive for SARS-CoV-2 infection. SARS-CoV-2-infected and mock-infected Vero (African green monkey kidney epithelial cells), Calu-3 (human lung adenocarcinoma epithelial cells), Caco-2 (human colorectal adenocarcinoma epithelial cells) and A549 (human lung carcinoma epithelial cells) were analyzed over time (0, 2, 24, 48 hours). Differential polyadenylation was found to occur in both infected Calu-3 and Vero cells at a late time point (48 hpi), with Gene Ontology (GO) terms such as viral transcription and translation shown to be significantly enriched in Calu-3 data. Poly(A) tails showed increased lengths in the majority of the differentially polyadenylated transcripts in Calu-3 and Vero cell lines (up to ~101 nt in mean poly(A) length, padj = 0.029). Of these genes, ribosomal protein genes such as RPS4X and RPS6 also showed downregulation in expression levels, suggesting the importance of ribosomal protein genes during infection. Furthermore, differential transcript usage was identified in Caco-2, Calu-3 and Vero cells, including transcripts of genes such as GSDMB and KPNA2, which have previously been implicated in SARS-CoV-2 infections. Overall, these results highlight the potential role of differential polyadenylation and transcript usage in host immune response or viral manipulation of host mechanisms during infection, and therefore showcase the value of long-read sequencing in identifying less-explored host responses to disease.

SciScore for 10.1101/2021.12.14.472725:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Experimental Models: Cell Lines
Sentences	Resources
Additional datasets were generated for this study including PCR cDNA datasets for cell lines (Vero, Caco-2, Calu-3 and A549) and the direct RNA and direct cDNA datasets for A549.	Vero suggested: None Calu-3 suggested: None
For this current study, we additionally cultured A549 (human lung carcinoma epithelial – ATCC CCL-185) cells to supplement our main data, using similar methods.	A549 suggested: None
Briefly, RNA from mock control and infected cells harvested at 0, 2, 24 and 48 hpi from Caco-2, Calu-3 and Vero cells was sequenced with the ONT Direct cDNA Sequencing Kit (SQK-DCS109) in conjunction with the Native Barcoding Kit (EXP-NBD104)	Caco-2 suggested: None
Software and Algorithms
Sentences	Resources
availability: ONT sequencing data (direct RNA and direct cDNA) for this study from cell lines (Vero, Caco-2 and Calu-3) was derived from our previous work (Chang et al., 2021), and is currently publicly available at NCBI repository BioProject PRJNA675370	BioProject suggested: (NCBI BioProject, RRID:SCR_004801)
The results of individual analyses are available at Figtree DOI: 10.6084/m9.figshare.17139995 (differential expression), 10.6084/m9.figshare.16841794 (differential polyadenylation) and 10.6084/m9.figshare.17140007 (differential transcript usage).	Figtree suggested: (FigTree, RRID:SCR_008515)
All direct RNA and direct cDNA libraries were loaded onto an R9.4.1 MinION flow cell and sequenced for 72 hrs using an ONT MinION or GridION.	MinION suggested: (MinION, RRID:SCR_017985)
All resulting FASTQ data were mapped using Minimap2 v2.17 (Heng Li, 2018)	Minimap2 suggested: (Minimap2, RRID:SCR_018550)
Direct RNA-seq data was mapped to the combined genome (consisting of the human/African green monkey genome from Ensembl (release 100), the SARS-CoV-2 Australia virus (Australia/VIC01/2020, NCBI:MT007544.1) and the RNA sequin decoy chromosome genome (Hardwick et al., 2016)) with the default direct RNA parameters ‘-ax splice -uf -k14 --secondary=no’, and all cDNA datasets were mapped with ‘-ax splice --secondary=no’.	Ensembl suggested: (Ensembl, RRID:SCR_002344)
The resulting BAM files were sorted and indexed using Samtools v1.9 (H.	Samtools suggested: (SAMTOOLS, RRID:SCR_002105)
Counts files were generated using Featurecounts v2.0.0 (Liao, Smyth, & Shi, 2014) for genome-mapped cDNA data, and with Salmon v0.13.1 (Patro	Featurecounts suggested: (featureCounts, RRID:SCR_012919) Salmon suggested: (Salmon, RRID:SCR_017036)
Differential expression analysis: DESeq2 was used to identify differentially expressed genes/transcripts from direct cDNA data.	DESeq2 suggested: (DESeq, RRID:SCR_000154)
For the nanopolish analysis, all Caco-2, Calu-3 and Vero direct RNA BAM files mapped to the combined reference genome (host, sequin, virus) were indexed with the nanopolish v0.13.2 ‘index’ function with the command ‘nanopolish index -d $FAST5 -s $SEQUENCING_SUMMARY $FASTQ’.	nanopolish suggested: (Nanopolish, RRID:SCR_016157)
Raincloud plots were generated for median poly(T) lengths of each gene with increased poly(A) length in the Calu-3 48 hpi dataset in both conditions (control and infected) using ggplot2 v3.3.4 (Wickham, 2016) to replicate the raincloud plots generated by the raincloudplots package in R (Allen et al., 2021).	ggplot2 suggested: (ggplot2, RRID:SCR_014601)
GO and KEGG pathway analysis: Significant GO biological terms and KEGG pathways were identified for genes that were found to be significantly differentially expressed and polyadenylated in the analyses above.	KEGG suggested: (KEGG, RRID:SCR_012773)
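
The mapping and sorting steps quoted in the table above can be sketched as command lines assembled in Python. This is a minimal illustration, not the authors' pipeline: the function names, the library labels (`direct_rna`, `cdna`) and all file names are hypothetical, and the cDNA option is written here as the standard double-hyphen `--secondary=no`. Only the Minimap2 flags themselves are taken from the quoted Methods text.

```python
from typing import List

# Minimap2 flags as quoted in the Methods excerpt above.
DIRECT_RNA_FLAGS = ["-ax", "splice", "-uf", "-k14", "--secondary=no"]
CDNA_FLAGS = ["-ax", "splice", "--secondary=no"]


def minimap2_command(library: str, reference: str, fastq: str) -> List[str]:
    """Return the minimap2 argument list for a 'direct_rna' or 'cdna' library.

    The library labels are hypothetical names chosen for this sketch.
    minimap2 writes SAM to standard output, so a caller would redirect it.
    """
    if library == "direct_rna":
        flags = DIRECT_RNA_FLAGS
    elif library == "cdna":
        flags = CDNA_FLAGS
    else:
        raise ValueError(f"unknown library type: {library!r}")
    return ["minimap2", *flags, reference, fastq]


def samtools_commands(sam: str, bam: str) -> List[List[str]]:
    """Return argument lists that sort a SAM file into BAM, then index it."""
    return [
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(minimap2_command("direct_rna", "combined_reference.fa", "reads.fastq")))
    for step in samtools_commands("reads.sam", "reads.sorted.bam"):
        print(" ".join(step))
```

Building the argument lists separately from running them keeps the two parameter sets easy to compare: the direct RNA preset differs from the cDNA one only by `-uf` (forward-strand transcripts) and `-k14` (a shorter k-mer).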

Results from OddPub: Thank you for sharing your data.

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.

Related articles

Differential expression of two microRNAs, miR-29a-3p and miR-15b-5p, in SARS-CoV-2 infection in the immunocompromised and immunocompetent host

Human Cytomegalovirus Strain Specific Differences in Protein Expression of Type I IFN Pathway Proteins Do Not Impact Virus Replication.

Poxvirus Replication Remodels Host m⁶A Epitranscriptome to Advance Infection