Max-INtensity Untargeted Transformation (MINUT) for Direct Chemometric Modeling of High-Resolution Mass Spectrometry Data
Abstract
Conventional untargeted classification methods in chromatography-coupled high-resolution mass spectrometry (HRMS) rely on preprocessing steps that distort the data, cause information loss, and limit chemical interpretability. To overcome these limitations, we developed MINUT (Max-INtensity Untargeted Transformation), a novel framework for processing chromatography-coupled HRMS data that uses a two-dimensional maximum intensity binning approach. This method preserves the full resolution of both mass-to-charge ratio (m/z) and retention time while making no prior assumptions about the data. As a result, MINUT enables direct chemometric modeling with minimal preprocessing and without compromising analytical precision. We validated MINUT and demonstrated its versatility across biological, clinical, and food authenticity datasets, including two public biomedical benchmarks: lung cancer and COVID-19 plasma samples. The method consistently outperformed conventional metabolomic pipelines in classification accuracy and interpretability. It recovered known biomarkers, such as sphingosine in the COVID-19 samples, and also revealed novel discriminative compounds. MINUT is a robust, scalable, and generalizable tool for HRMS-based classification and biomarker identification that lowers analytical costs, improves reproducibility, and enables interpretable HRMS classification. Our results therefore have major implications across a wide range of disciplines, including clinical diagnostics, environmental sciences, food analysis, toxicology, and biotechnology.
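The abstract does not spell out how the two-dimensional maximum intensity binning is implemented, so the following is only a minimal sketch of the general idea, not the authors' code. It assumes each sample is available as parallel arrays of retention time, m/z, and intensity, and that the bin edges are chosen by the user; the function name `max_intensity_grid` and its parameters are hypothetical.

```python
import numpy as np

def max_intensity_grid(rt, mz, intensity, rt_edges, mz_edges):
    """Bin (rt, m/z, intensity) points on a 2D grid, keeping the maximum
    intensity per cell. Hypothetical helper; bin edges are an assumption."""
    rt = np.asarray(rt, dtype=float)
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    # Map each point to a bin index using half-open bins [edge_i, edge_{i+1}).
    rt_idx = np.searchsorted(rt_edges, rt, side="right") - 1
    mz_idx = np.searchsorted(mz_edges, mz, side="right") - 1

    n_rt, n_mz = len(rt_edges) - 1, len(mz_edges) - 1
    # Discard points that fall outside the grid.
    inside = (rt_idx >= 0) & (rt_idx < n_rt) & (mz_idx >= 0) & (mz_idx < n_mz)

    grid = np.zeros((n_rt, n_mz))
    # ufunc.at is unbuffered, so cells hit by several points keep the largest value.
    np.maximum.at(grid, (rt_idx[inside], mz_idx[inside]), intensity[inside])
    return grid
```

Under these assumptions, each sample's grid could be flattened with `grid.ravel()` into one row of a feature matrix for a chemometric classifier. Because a cell stores a maximum rather than a sum or mean, every retained value corresponds to an intensity that was actually measured.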