Validation of diverse and previously untraceable Sendai virus copyback viral genomes by direct RNA sequencing

Sarah E. Pye
Emna Achouri
Yanling Yang
Abdulafiz Musa
Carolina B. López

Abstract

Copyback viral genomes (cbVGs) are truncated viral genomes with complementary ends produced when the negative-sense RNA virus polymerase detaches from the replication template and resumes elongation from the nascent strand. Despite advances in methods to identify cbVGs based on the site of polymerase break and rejoin, PCR-based tools cannot provide full-length sequences of most cbVGs and/or can introduce errors and artifacts during cbVG amplification. These limitations have painted an incomplete picture of the diverse population of cbVGs generated during infection. To improve our ability to obtain native full-length sequences of cbVGs, we optimized direct RNA sequencing (DRS) as a fast and simple tool to sequence full-length cbVGs and harnessed a BLAST-based analysis approach to identify cbVGs from long-read sequencing data. We analyzed the DRS outputs of multiple Sendai virus (SeV) stocks to highlight both the utility and limitations of this tool. We found that to capture the dominant 546 nt cbVG produced by SeV strain Cantell, the length of complementarity between the virus trailer and the DRS oligonucleotide should optimally be increased to up to 32 nt. We also demonstrate comparable quality of cbVG sequences by DRS from as little RNA as 17.6 ng from the media fraction or 50 ng from the cellular fraction of cells infected with SeV, in contrast to the recommended 1,000 ng. Importantly, we validated different cbVG species from two recombinant SeV stocks, including cbVGs whose break positions occurred at or near position one in the reference genome.
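As an illustration of how a break and rejoin position pair defines a cbVG with complementary ends, the following is a minimal sketch under a deliberately simplified model. It is not the BLAST-based analysis used in the study. The reference string, the positions, and the function names are hypothetical, and the coordinate convention (1-based positions on a single reference strand, with break < rejoin) is an assumption made for the example. Under this model the cbVG length is (L - break + 1) + (L - rejoin + 1) for a reference of length L.

```python
# Toy model of a copyback genome (cbVG). Assumptions: 1-based positions on one
# reference strand, break_pos < rejoin_pos, and strand polarity is ignored.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def copyback_sequence(reference: str, break_pos: int, rejoin_pos: int) -> str:
    """Build a toy cbVG from a reference and a break/rejoin pair.

    The molecule is the reverse complement of reference[rejoin..end]
    followed by reference[break..end], which gives complementary ends.
    """
    n = len(reference)
    if not (1 <= break_pos < rejoin_pos <= n):
        raise ValueError("expected 1 <= break_pos < rejoin_pos <= len(reference)")
    stem = reference[rejoin_pos - 1:]         # region copied on both arms
    break_to_end = reference[break_pos - 1:]  # loop region plus the stem
    return reverse_complement(stem) + break_to_end


def copyback_length(genome_length: int, break_pos: int, rejoin_pos: int) -> int:
    """Predicted cbVG length from the junction coordinates alone."""
    return (genome_length - break_pos + 1) + (genome_length - rejoin_pos + 1)


def complementary_end_length(seq: str) -> int:
    """Count how many 5' nucleotides pair perfectly with the 3' end."""
    k = 0
    while k < len(seq) // 2 and seq[: k + 1] == reverse_complement(seq[-(k + 1):]):
        k += 1
    return k


# Hypothetical 20-nt reference, not a real viral sequence.
toy_reference = "ACGUACGGUUCAGCUAGGCA"
toy_cbvg = copyback_sequence(toy_reference, break_pos=9, rejoin_pos=16)
print(toy_cbvg, len(toy_cbvg), complementary_end_length(toy_cbvg))
```

In this toy case the predicted length from the coordinates matches the length of the constructed molecule, and the complementary ends span the region from the rejoin position to the end of the reference. A full-length read of a real cbVG could in principle be checked against such a prediction, although real data would also need to account for sequencing errors and strand orientation.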

IMPORTANCE

Most viruses of the order Mononegavirales have been demonstrated to naturally generate copyback viral genomes. These genomes are critical determinants of infection outcomes; they interfere with standard virus replication by competing for viral resources, activate antiviral responses, and inhibit protein translation. Despite their critical roles in infection, current tools to study copyback viral genomes rely either on preexisting knowledge of the sequence of a target RNA or require reverse transcription and amplification of the target RNA, biasing toward short copyback genomes and introducing relatively high rates of errors. Here, we detail the optimization of direct RNA sequencing to validate native full-length copyback viral genomes, including species that have not been validated previously.

Version published to 10.1128/jvi.00894-25
Aug 19, 2025
Version published to 10.1101/2025.04.04.647164 on bioRxiv
Apr 4, 2025