SARS-CoV-2-Host Chimeric RNA-Sequencing Reads Do Not Necessarily Arise From Virus Integration Into the Host DNA
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The human genome bears evidence of extensive invasion by retroviruses and other retroelements, as well as by diverse RNA and DNA viruses. A high frequency of somatic integration of the RNA virus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) into the DNA of infected cells was recently suggested, based on a number of observations. One key observation was the presence of chimeric RNA-sequencing (RNA-seq) reads between SARS-CoV-2 RNA and RNA transcribed from human host DNA. Here, we examined the possible origin specifically of human-SARS-CoV-2 chimeric reads in RNA-seq libraries and provide alternative explanations for their origin. Chimeric reads were also frequently detected between SARS-CoV-2 RNA and RNA transcribed from mitochondrial DNA or from episomal adenoviral DNA present in transfected cell lines; such reads are unlikely to be the result of SARS-CoV-2 integration. Furthermore, chimeric reads between SARS-CoV-2 RNA and RNA transcribed from nuclear DNA were highly enriched for host exonic, rather than intronic or intergenic, sequences and often involved the same, highly expressed host genes. Although these findings do not rule out SARS-CoV-2 somatic integration, they nevertheless suggest that human-SARS-CoV-2 chimeric reads found in RNA-seq data may arise during library preparation and do not necessarily signify SARS-CoV-2 reverse transcription, integration into host DNA and further transcription.
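The classification described in the abstract (viral-host chimeric reads grouped by whether the host part is mitochondrial, exonic, intronic or intergenic) can be illustrated with a minimal sketch. This is not the authors' code. The exon and gene intervals, the read records, the `chrM` contig name and all function names are hypothetical and chosen only for illustration; only the viral reference name `NC_045512v2` comes from the methods quoted in the SciScore report.

```python
from collections import Counter

VIRAL_CONTIG = "NC_045512v2"  # SARS-CoV-2 reference name quoted in the report
MITO_CONTIG = "chrM"          # assumption: UCSC-style mitochondrial contig name

# Hypothetical toy annotation: contig -> list of half-open (start, end) intervals.
EXONS = {"chr1": [(100, 200), (500, 650)], "chr2": [(1000, 1200)]}
GENES = {"chr1": [(100, 650)], "chr2": [(1000, 1200)]}


def in_any(intervals, pos):
    """Return True if pos falls inside any half-open interval."""
    return any(start <= pos < end for start, end in intervals)


def classify_host_position(contig, pos):
    """Label a host alignment position by the kind of sequence it hits."""
    if contig == MITO_CONTIG:
        return "mitochondrial"
    if in_any(EXONS.get(contig, []), pos):
        return "exonic"
    if in_any(GENES.get(contig, []), pos):
        return "intronic"  # inside a gene span but outside its exons
    return "intergenic"


def classify_chimeric_reads(reads):
    """Count host-segment categories among virus-host chimeric reads.

    reads: iterable of (read_id, [(contig, position), ...]) where each
    tuple in the list is one aligned segment of the read.
    """
    counts = Counter()
    for _read_id, segments in reads:
        contigs = {contig for contig, _ in segments}
        # A virus-host chimera needs the viral contig plus at least one other.
        if VIRAL_CONTIG not in contigs or len(contigs) < 2:
            continue
        for contig, pos in segments:
            if contig != VIRAL_CONTIG:
                counts[classify_host_position(contig, pos)] += 1
    return counts


# Hypothetical reads, one per category, plus two non-chimeric reads.
TOY_READS = [
    ("r1", [(VIRAL_CONTIG, 29000), ("chr1", 150)]),       # exonic partner
    ("r2", [(VIRAL_CONTIG, 100), ("chrM", 3000)]),        # mitochondrial partner
    ("r3", [("chr1", 150), ("chr1", 520)]),               # host-only, skipped
    ("r4", [(VIRAL_CONTIG, 500), ("chr1", 300)]),         # intronic partner
    ("r5", [(VIRAL_CONTIG, 500), ("chr2", 5000)]),        # intergenic partner
    ("r6", [(VIRAL_CONTIG, 10), (VIRAL_CONTIG, 28000)]),  # viral-only, skipped
]

if __name__ == "__main__":
    print(dict(classify_chimeric_reads(TOY_READS)))
```

On real data, the category counts would be compared against the fraction of the genome or transcriptome each category occupies. A strong excess of exonic and mitochondrial partners is what the abstract reports, and it is the pattern expected if chimeras form between abundant RNA molecules during library preparation.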
Article activity feed
SciScore for 10.1101/2021.03.05.434119:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms (each sentence is followed by the detected resource and the suggested match)
- RNA-seq analysis: Public RNA-seq datasets (Blanco-Melo et al., 2020) under the accession number GSE147507 were downloaded from the NCBI Gene Expression Omnibus (GEO) server. Detected: Gene Expression Omnibus; suggested: (Gene Expression Omnibus (GEO), RRID:SCR_005012)
- Adapter and quality trimming were conducted using Trimmomatic v0.36 (Bolger et al., 2014). Detected: Trimmomatic; suggested: (Trimmomatic, RRID:SCR_011848)
- Quality of sequencing reads was assessed by FastQC v0.11.5. Detected: FastQC; suggested: (FastQC, RRID:SCR_014583)
- The resulting reads were aligned to the merged GRCh38/hg38 genome (including alternative and random chromosome sequences) and SARS-CoV-2 NC_045512v2 genome using the STAR v2.7.1 aligner (Dobin et al., 2013). Detected: STAR; suggested: (STAR, RRID:SCR_015899)
- GENCODE v29 basic version and wuhCor1 NCBI genes (http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/bigZips/genes/) were used for human and SARS-CoV-2 gene annotations, respectively. Detected: GENCODE; suggested: (GENCODE, RRID:SCR_014966)
- Gene expression was calculated by FeatureCounts (part of the Subread package v1.5.0) (Liao et al., 2014) and normalized with DESeq2 v1.22.1 within R v3.5.1 (Love et al., 2014). Detected: FeatureCounts; suggested: (featureCounts, RRID:SCR_012919). Detected: Subread; suggested: (Subread, RRID:SCR_009803). Detected: DESeq2; suggested: (DESeq, RRID:SCR_000154)
- BLASTN+ v2.3.0 was used to align mtRNA-nRNA chimeric reads to identify mitochondrial and nuclear aligning sequences within the reads (Camacho et al., 2009). Detected: BLASTN+; suggested: None
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
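The normalization step quoted in the resources above (DESeq2) relies by default on median-of-ratios size factors. The sketch below illustrates that idea in plain Python so the arithmetic is visible. It is not the DESeq2 implementation and not the authors' pipeline; the function names and the toy count matrix are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts: dict mapping gene name -> list of raw counts (one per sample).
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero and the ratio is undefined.
    Assumes at least one gene has non-zero counts in every sample.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if any(c <= 0 for c in row):
            continue
        # Log of the geometric mean of this gene across samples.
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # The size factor of a sample is the median ratio over all usable genes.
    return [math.exp(median(ratios)) for ratios in log_ratios]


def normalize(counts):
    """Divide each sample's raw counts by that sample's size factor."""
    factors = size_factors(counts)
    return {
        gene: [c / f for c, f in zip(row, factors)]
        for gene, row in counts.items()
    }


# Hypothetical toy matrix: sample 2 was sequenced exactly twice as deeply.
TOY_COUNTS = {"geneA": [10, 20], "geneB": [100, 200], "geneC": [5, 10]}

if __name__ == "__main__":
    print(size_factors(TOY_COUNTS))
    print(normalize(TOY_COUNTS))
```

In the toy matrix the second sample has twice the depth of the first, so its size factor is twice as large and the normalized counts of each gene agree across the two samples. Taking the median over genes keeps a few very highly expressed genes from dominating the scaling.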
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.