De novo assembly of plasmodium interspersed repeat (pir) genes from Plasmodium vivax RNAseq data suggests geographic conservation of sub-family transcription
Abstract
Background: The plasmodium interspersed repeats (pir) multigene family is found across malaria parasite genomes. It was first discovered in the human-infecting species Plasmodium vivax, where its members were initially named the virs. Their function remains unknown, although studies have suggested a role in virulence of the asexual blood stages. Sub-families of the P. vivax pir/virs have been identified and are found in isolates from across the world; however, their transcription at different localities and in different stages of the life cycle has not been quantified. Multiple transcriptomic studies of the parasite have been conducted, but many map the pir reads to existing reference genomes (as part of standard bioinformatic practice), which may miss members of the multigene family due to its inherent variability. This obscures our understanding of how the pir sub-families in P. vivax may contribute to human/vector infection.
Results: To overcome the hidden pir diversity that results from using a reference genome, we employed de novo transcriptome assembly to construct the pir ‘reference’ of different parasite isolates from published and novel RNAseq datasets. For this purpose, a pipeline was written in Nextflow and first tested on data from the rodent-infecting P. c. chabaudi parasite to ascertain its efficacy on a sample with a full, genome-based set of pir gene sequences. The pipeline assembled hundreds of pirs from the studies included. By performing BLAST sequence identity comparisons with reference genome pirs (including P. vivax and related species), we found a clustered network of transcripts which corresponded well with prior sub-family annotations, albeit requiring some updated nomenclature. Mapping the RNAseq datasets to the de novo transcriptome references revealed that the transcription of these updated pir gene sub-families is generally consistent across the different geographical regions. From this transcriptional quantification, a time course of mosquito bloodmeals (after feeding on an infected patient) highlighted the first evidence of ookinete stage pir transcription in a human-infective malaria parasite.
Conclusions: De novo transcriptome assembly is a valuable tool for understanding highly variable multigene families from Plasmodium spp., and with pipeline software it can be applied more easily and at scale. Despite a global distribution, P. vivax has a conserved pir sub-family structure, both in terms of genome copy number and transcription. We suggest that this indicates important roles of the distinct sub-families, or a genetic mechanism maintaining their preservation. Furthermore, a burst of pir transcription in the mosquito stages of development is the first indication of ookinete pir expression for a human-infective malaria parasite, suggesting a role for the gene family at a new stage of the life cycle.
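As an illustration of the BLAST-based clustering step mentioned in the Results, the sketch below groups sequences into a network by linking any pair whose BLAST hit meets an identity threshold, then reports the connected components. It is not the authors' pipeline: the function name cluster_by_identity, the 70% threshold, and the example IDs are all hypothetical, and the input is assumed to be BLAST tabular output (outfmt 6), where the first three columns are query ID, subject ID, and percent identity.

```python
from collections import defaultdict


def cluster_by_identity(blast_rows, min_identity=70.0):
    """Group sequence IDs into clusters linked by hits >= min_identity.

    blast_rows: iterable of tab-separated lines whose first three fields
    are query ID, subject ID and percent identity (BLAST outfmt 6 order).
    min_identity: hypothetical cut-off, chosen only for illustration.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for line in blast_rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        query, subject, identity = fields[0], fields[1], float(fields[2])
        # Register both IDs so unlinked sequences appear as singletons.
        find(query)
        find(subject)
        if query != subject and identity >= min_identity:
            union(query, subject)

    clusters = defaultdict(set)
    for seq_id in parent:
        clusters[find(seq_id)].add(seq_id)
    # Largest clusters first, members sorted for stable output.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


if __name__ == "__main__":
    # Hypothetical hits: assembled transcripts (tx*) against reference pirs.
    rows = [
        "tx1\tref_A\t92.5",
        "tx2\tref_A\t88.0",
        "tx3\tref_B\t75.1",
        "tx4\tref_B\t40.0",
    ]
    for cluster in cluster_by_identity(rows):
        print(cluster)
```

In this toy input, tx4 falls below the threshold and so remains a singleton, which is how a transcript with no close reference match would surface in such a network. Real analyses would likely also filter on alignment length or e-value, which this sketch omits.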