High-Resolution Characterization of rAAV Genomes for Vector Quality Using Long-Read Sequencing
Abstract
Recombinant adeno-associated virus (rAAV)-mediated gene therapy has been applied to treat human diseases. However, rAAV capsids contain heterogeneous mixtures of full-length and truncated genomes and, depending on the manufacturing process, residual host cell and plasmid DNA. A method is therefore needed to characterize the encapsidated DNA of rAAV in order to support process development and batch release. Long-read sequencing (LRS), an emerging technology, has achieved single-genome resolution for AAV. Here we propose a Python-based LRS profiling framework to classify and quantify residual DNA species in rAAV products. We designed a reference that contains universal genetic components commonly used in rAAV production, including the AmpR, KanR, Rep and Cap genes along with the HPV18, Ad5 and hg38 genomes. We assessed the impurities of rAAV production in public and in-house LRS datasets. Analysis of the lambda fragments spiked into these datasets showed that sequencing introduced size biases, which could not be fully corrected by regression but can be reduced during library preparation. The functional potential of impurities was assessed through indicators derived from long-read alignments, which enabled us to quantitatively compare impurities between manufacturing batches. We demonstrate that LRS provides informative metrics for rAAV production and can facilitate process development to ensure therapeutic product safety and quality.
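To make the classification step concrete, the following is a minimal sketch of how reads aligned to a composite reference might be binned by source and counted. It is not the authors' framework: the contig names, the category labels, the functions `classify_contig` and `summarize_reads`, and the rule of assigning each read to its longest alignment are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical mapping from reference contig names to impurity categories.
# These contig names are illustrative assumptions, not the authors' reference.
CATEGORY_BY_CONTIG = {
    "vector": "vector genome",
    "AmpR": "plasmid backbone",
    "KanR": "plasmid backbone",
    "RepCap": "Rep/Cap",
    "Ad5": "Ad5",
    "HPV18": "HPV18",
}


def classify_contig(contig):
    """Map a contig name to a category; chr* contigs are treated as hg38."""
    if contig in CATEGORY_BY_CONTIG:
        return CATEGORY_BY_CONTIG[contig]
    if contig.startswith("chr"):
        return "hg38"
    return "unclassified"


def summarize_reads(alignments):
    """Return the fraction of reads assigned to each category.

    `alignments` is an iterable of (read_id, contig, aligned_length) tuples,
    possibly with several entries per read. As a simplifying assumption,
    each read is assigned to the contig of its longest alignment.
    """
    best = {}
    for read_id, contig, aligned_length in alignments:
        if read_id not in best or aligned_length > best[read_id][1]:
            best[read_id] = (contig, aligned_length)
    counts = Counter(classify_contig(contig) for contig, _ in best.values())
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}
```

In practice the tuples would be extracted from an alignment file produced by a long-read aligner. Assigning a read to a single category is a deliberate simplification, since a chimeric read that spans vector and backbone sequence would need a dedicated category of its own.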