tadar: an R/Bioconductor package to reduce eQTL noise in differential expression analysis
Abstract
Despite the relative maturity of bulk RNA-sequencing compared to more recent developments such as single-cell and spatial RNA-sequencing, biases that affect data analysis continue to surface. One such bias, termed "Differential Allelic Representation" (DAR), is particularly evident when experimental samples are taken from non-isogenic genetic backgrounds. DAR is an uneven distribution of polymorphic loci between the groups of experimental samples being compared in a differential gene expression (DGE) analysis. When the unequally represented polymorphic loci are also expression quantitative trait loci (eQTLs), DAR can produce differences in gene expression that are not directly relevant to the primary research objectives. To mitigate DAR in both new and existing datasets, we introduce tadar, a Bioconductor package designed to facilitate transcriptome analysis by accounting for differential allelic representation. tadar implements a methodology that calculates a DAR metric at each polymorphic locus across the genome; this metric serves as a predictive measure of a locus's potential to cause eQTL-driven expression differences and is then used to reduce eQTL noise in bulk RNA-sequencing data.
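To make the idea of a per-locus DAR metric concrete, the following is a minimal sketch in Python rather than R. The abstract does not state the formula, so the definition used here is an assumption: DAR at a locus is taken to be the Euclidean distance between the allele-proportion vectors of the two sample groups, divided by √2 so that it lies between 0 (identical allelic representation) and 1 (no alleles shared). The function names `allele_proportions` and `dar` are hypothetical and are not part of the tadar API.

```python
import math


def allele_proportions(counts):
    """Convert allele counts at one locus, e.g. {"A": 6, "G": 2}, to proportions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles observed at this locus")
    return {allele: n / total for allele, n in counts.items()}


def dar(counts_group1, counts_group2):
    """Assumed DAR definition: scaled Euclidean distance between the
    allele-proportion vectors of two sample groups at a single locus."""
    p1 = allele_proportions(counts_group1)
    p2 = allele_proportions(counts_group2)
    alleles = set(p1) | set(p2)
    # An allele absent from a group has proportion 0 in that group.
    distance = math.sqrt(
        sum((p1.get(a, 0.0) - p2.get(a, 0.0)) ** 2 for a in alleles)
    )
    # The largest possible distance between two proportion vectors is sqrt(2).
    return distance / math.sqrt(2)


# Illustrative counts only, not real data.
same = dar({"A": 6, "G": 2}, {"A": 3, "G": 1})  # identical proportions
disjoint = dar({"A": 8}, {"G": 8})              # no shared alleles
partial = dar({"A": 8}, {"A": 4, "G": 4})       # partially shared
```

Under this assumed definition, loci with values near 1 would be those where the two groups carry largely different alleles, and so where eQTL-driven expression differences would be most plausible.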