A benchmarking workflow for assessing the reliability of batch correction methods

Elfried Salanon
Blandine Comte
Delphine Centeno
Stéphanie Durand
Estelle Pujos-Guillot
Julien Boccard

Abstract

One of the most pervasive challenges in large-scale untargeted metabolomics is short- and long-term analytical variability, which makes batch effect correction necessary. In this context, several strategies and methods have been developed to limit these effects, either by monitoring the data generation process to maximize reproducibility or by applying post-analysis data correction. Different evaluation frameworks have also been proposed, which either assess the degree of bias in the data through visual tools or quantitative indicators, or evaluate the prediction performance of known biomarkers. However, there is currently no clear consensus on how to evaluate batch correction methods. This work offers a comprehensive framework for assessing multiple dimensions of batch correction, covering both the effectiveness and the reliability of correction methods. Based on the Mahalanobis Conformity Index (MCI), it provides a multivariate, covariance-aware metric to quantify within- and between-batch variability. Additionally, it combines visualization techniques (Principal Component Analysis (PCA) and Multivariate INTegrative (MINT) PCA) with numerical indicators (batch dispersion, Coefficient of Variation), supporting both multidimensional and metabolite-specific evaluations. Lastly, the approach integrates statistical tools with chemistry-based metrics to assess method overfitting and overcorrection. Applied to a use case comparing LOESS-based and ComBat correction methods, the workflow provided a structured approach to systematically assess the reliability of batch corrections, ensuring both data intercomparability and biological relevance in metabolomics studies.
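The abstract does not give the formula for the MCI, so the sketch below does not reproduce it. It only illustrates the underlying idea of a covariance-aware distance: the squared Mahalanobis distance of each sample to a reference cloud. The function name, the simulated data, and the choice of a pseudo-inverse are assumptions made for illustration.

```python
import numpy as np

def mahalanobis_to_reference(samples, reference):
    """Squared Mahalanobis distance of each row of `samples` to the
    distribution estimated from `reference` (rows = samples, columns = features).
    Hypothetical helper, not the authors' MCI."""
    reference = np.asarray(reference, dtype=float)
    samples = np.asarray(samples, dtype=float)
    center = reference.mean(axis=0)
    cov = np.atleast_2d(np.cov(reference, rowvar=False))
    # Pseudo-inverse tolerates a singular covariance matrix
    # (for example, more features than reference samples).
    cov_inv = np.linalg.pinv(cov)
    diff = samples - center
    # Row-wise quadratic form diff_i^T * cov_inv * diff_i
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

# Simulated data (assumption): one reference batch, one comparable batch,
# and the same batch with an additive shift on the first feature.
rng = np.random.default_rng(0)
reference_batch = rng.normal(size=(200, 3))
same_batch = rng.normal(size=(50, 3))
shifted_batch = same_batch + np.array([3.0, 0.0, 0.0])

d_same = mahalanobis_to_reference(same_batch, reference_batch)
d_shifted = mahalanobis_to_reference(shifted_batch, reference_batch)
print(d_same.mean(), d_shifted.mean())
```

Under these assumptions, the shifted batch sits farther from the reference on average, which is the kind of between-batch discrepancy a covariance-aware metric is meant to expose.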

Author summary

Assessing batch correction is a challenge in metabolomics: there is no consensus on a defined strategy, which makes it a complex task. However, given the impact of batch correction on datasets and its possible consequences for downstream statistical analyses, a reliable framework for its assessment is a cornerstone of reproducible results. The objective of the present work was to provide a framework and a set of interpretation tools, combining numerical indicators and diagnostic plots, for assessing the reliability of batch correction methods. We introduced a robust evaluation framework centered on the Mahalanobis Conformity Index, providing a multivariate and covariance-aware metric to quantify within- and between-batch variability. By coupling this index with visual tools (based on Principal Component Analysis (PCA) and Multivariate INTegrative (MINT) PCA) and with compound-level diagnostics, we enabled a fine-grained and interpretable comparison of correction strategies, highlighting their strengths and potential pitfalls.
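As a rough illustration of a compound-level numerical indicator, the sketch below computes the Coefficient of Variation of one metabolite within each batch and across pooled batches. The intensities, batch labels, and the use of repeated measurements of a single compound are hypothetical, and this is not presented as the authors' implementation.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of Variation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical intensities of a single metabolite, grouped by batch.
intensities = {
    "batch_1": [100.0, 102.0, 98.0],
    "batch_2": [130.0, 128.0, 132.0],
}

# Within-batch CV: variability inside each batch taken separately.
within = {batch: cv_percent(vals) for batch, vals in intensities.items()}

# Pooled CV: variability once all batches are merged, which also
# absorbs any shift between batch means.
pooled = cv_percent([x for vals in intensities.values() for x in vals])

print(within, pooled)
```

In this toy case the pooled CV is much larger than either within-batch CV, because the two batches differ in mean intensity. Comparing such values before and after correction is one simple way to read a metabolite-specific indicator.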

Version published to 10.1101/2025.08.01.668073 on bioRxiv
Aug 2, 2025

Related articles

Comparative evaluation of imputation and batch-effect correction for proteomics/peptidomics differential-expression analysis

This article has 5 authors:
1. Charis Gonidaki
2. Agnieszka Latosinska
3. Antonia Vlahou
4. Rafael Stroggilos
5. Harald Mischak
This article has no evaluations
Latest version Aug 16, 2025

The Consequences of Statistical Tests on Using Proxy Measurements in Place of Gold Standard Measurements: An Application to Magnetic Resonance Spectroscopy

This article has 3 authors:
1. Michael Treacy
2. Christoph Juchem
3. Karl Landheer
This article has no evaluations
Latest version Aug 16, 2025

Assumption-Agnostic Deep Learning Framework for Holistic Clinical Trial Monitoring

This article has 3 authors:
1. Shaoming Yin
2. Zheyang Wu
3. Jianchang Lin
This article has no evaluations
Latest version Aug 1, 2025
