BatchVaria: a variance-aware framework for evaluating batch correction in high-dimensional omics data
Abstract
Summary
Batch effects and other unwanted technical sources of variation remain a persistent challenge in the integrative analysis of high-dimensional omics data. Although established methods such as ComBat effectively mitigate batch-associated signal, their impact on biologically meaningful variation is frequently evaluated in an ad hoc, non-quantitative manner. This is particularly problematic in heterogeneous disease contexts, such as breast cancer transcriptomics, where technical and biological sources of variation may be partially confounded. We present BatchVaria, an R package that implements a variance-aware framework for batch correction and post-adjustment evaluation. BatchVaria integrates variance component modelling, batch adjustment, and systematic re-profiling within a unified analysis container, so that technical and biological variance contributions can be quantified, adjusted, and reassessed iteratively while analytical provenance is preserved. By supporting multiple variance profiling engines and structured storage of intermediate results, BatchVaria facilitates transparent and reproducible evaluation of batch correction strategies. We demonstrate the utility of BatchVaria using a publicly available breast cancer transcriptomic dataset with known covariate-driven structure, illustrating how iterative variance profiling can guide responsible batch correction without eroding subtype-associated biological signal.
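The profile, adjust, re-profile loop described above can be illustrated with a minimal sketch for a single feature. This is not the BatchVaria API and it does not implement ComBat. Everything in it is an assumption made for illustration: the function names (`variance_fraction`, `center_by_batch`, `profile`) are hypothetical, the toy expression values are invented, eta-squared (between-group sum of squares over total sum of squares) stands in for a variance component model, and per-batch mean-centering stands in for the batch adjustment step.

```python
from statistics import fmean


def variance_fraction(values, labels):
    """Eta-squared: share of the total sum of squares explained by a grouping."""
    grand = fmean(values)
    total_ss = sum((v - grand) ** 2 for v in values)
    if total_ss == 0:
        return 0.0
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    between_ss = sum(len(vs) * (fmean(vs) - grand) ** 2 for vs in groups.values())
    return between_ss / total_ss


def center_by_batch(values, batches):
    """Location-only adjustment: subtract each batch mean, then restore the grand mean.

    A deliberately simple stand-in for a real batch correction method.
    """
    grand = fmean(values)
    groups = {}
    for v, b in zip(values, batches):
        groups.setdefault(b, []).append(v)
    means = {b: fmean(vs) for b, vs in groups.items()}
    return [v - means[b] + grand for v, b in zip(values, batches)]


def profile(values, batch, subtype):
    """Variance profile of one feature against the technical and biological factor."""
    return {
        "batch": variance_fraction(values, batch),
        "subtype": variance_fraction(values, subtype),
    }


# Hypothetical toy data: one gene, two batches, two subtypes, balanced design.
batch = ["A", "A", "A", "A", "B", "B", "B", "B"]
subtype = ["basal", "basal", "luminal", "luminal"] * 2
expr = [5.0, 5.2, 7.1, 6.9, 8.0, 8.3, 10.0, 9.9]

before = profile(expr, batch, subtype)      # profile
adjusted = center_by_batch(expr, batch)     # adjust
after = profile(adjusted, batch, subtype)   # re-profile

print("before:", before)
print("after: ", after)
```

In this balanced toy design, centering removes the batch share while the subtype share of the remaining variance goes up. If batch and subtype were confounded (for example, most basal samples in one batch), the same adjustment would also remove part of the subtype signal, which is the situation that re-profiling after adjustment is meant to expose.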