Examining intra-host genetic variation of RSV by short read high-throughput sequencing

Every viral infection entails an evolving population of viral genomes. High-throughput sequencing technologies can be used to characterize such populations, but to date there are few published examples of such work. In addition, mixed sequencing data are sometimes used to infer properties of infecting genomes without discriminating between genome-derived reads and transcript-derived reads, even though transcripts are much more abundant than genomes in a typical active viral infection. Here we applied capture probe-based short-read high-throughput sequencing to nasal wash samples taken from a previously described group of adult hematopoietic cell transplant (HCT) recipients naturally infected with respiratory syncytial virus (RSV). We separately analyzed reads from genomes and transcripts for the levels and distribution of genetic variation by calculating per-position Shannon entropies. Our analysis reveals a low level of genetic variation within the RSV infections analyzed here, but with interesting differences between genomes and transcripts in 1) average per-sample Shannon entropies; 2) the genomic distribution of variation ‘hotspots’; and 3) the genomic distribution of hotspots encoding alternative amino acids. Overall, our results suggest the importance of separately analyzing reads from genomes and transcripts when interpreting high-throughput sequencing data for insight into intra-host viral genome replication, expression, and evolution.
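
As an illustration of the per-position Shannon entropy measure mentioned above, the following is a minimal sketch and not the pipeline used in this study. It assumes that base counts (A, C, G, T) have already been tallied at each position separately for genome-derived and transcript-derived reads, and that entropy is expressed in bits (log base 2); the logarithm base, the function names, and the example counts are all assumptions made for illustration.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy, in bits, of the base counts at one position.

    `counts` maps each base to its read count, e.g. {"A": 98, "G": 2}.
    Using log base 2 is an assumption; the source does not state the base.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no coverage: treated here as no observed variation
    entropy = 0.0
    for n in counts.values():
        if n > 0:  # bases with zero count contribute nothing
            p = n / total
            entropy -= p * math.log2(p)
    return entropy


def per_position_entropies(pileup):
    """Entropy at every position of a list of per-position base counts."""
    return [shannon_entropy(counts) for counts in pileup]


def mean_entropy(pileup):
    """Average per-position entropy for one sample (0.0 if empty)."""
    values = per_position_entropies(pileup)
    return sum(values) / len(values) if values else 0.0


# Hypothetical counts for three positions, invented for illustration only.
genome_pileup = [
    {"A": 200, "C": 0, "G": 0, "T": 0},   # invariant position
    {"A": 0, "C": 100, "G": 0, "T": 100},  # evenly mixed position
    {"A": 0, "C": 0, "G": 150, "T": 0},    # invariant position
]
genome_entropies = per_position_entropies(genome_pileup)
```

Running the same functions on a separate pileup built from transcript-derived reads would give the second set of per-position values needed to compare the two read classes. An invariant position yields an entropy of 0, and a position split evenly between two bases yields 1 bit.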

