plasmoRUtils: A one-stop R Package for Plasmodium and other Apicomplexan parasite-related Bioinformatics analysis
Abstract
Bioinformatics analysis of non-model organisms remains challenging due to the limited availability of specialized tools, as most R packages are optimized for well-annotated model species. This problem is exacerbated by genomic and proteomic data being scattered across multiple databases, each employing different identifiers based on varying reference annotations. Comprehensive databases such as VEuPathDB have been developed to disseminate knowledge of Apicomplexan genomics and proteomics. Several specialized databases, particularly for the malaria parasite Plasmodium, have also been developed, including ApicoTFDB, malaria.tools, MPMP, MIIP, Phenoplasm, and PlasmoBase, alongside broader resources such as HitPredict and TED. However, these platforms often rely on manual query interfaces, use outdated identifiers, and offer inefficient data retrieval methods, which complicates their use for large-scale bioinformatic analyses. To address these limitations, we present plasmoRUtils, an R package designed to streamline database access and data harmonization for Apicomplexan research. plasmoRUtils enables the retrieval of data tables using single-line R functions that take standardized Ensembl gene IDs as input. Because some databases, such as VEuPathDB, provide APIs, the package also offers functions to build CLI-based queries for VEuPathDB's component databases. It additionally supports overrepresentation analysis and the estimation of parasite transcriptomic age/stage using single-cell or bulk RNA-Seq references. By automating data retrieval and transformation within RStudio, plasmoRUtils eliminates the need for manual database queries, facilitating end-to-end workflow development without leaving the R environment. plasmoRUtils is available on GitHub (https://github.com/Rohit-Satyam/plasmoRUtils) with comprehensive documentation, and represents a critical step toward efficient and reproducible bioinformatics for Apicomplexan research.