Aird-MSI: a high compression rate and decompression speed format for mass spectrometry imaging data
Abstract
Mass spectrometry imaging has emerged as a pivotal tool in spatial metabolomics, yet its reliance on the imzML format poses critical challenges in data storage, transmission, and computational efficiency. While imzML ensures cross-platform compatibility, its weakly compressed binary architecture results in large file sizes and high parsing overhead, hindering cloud-based analysis and real-time visualization.
This study introduces an enhanced Aird compression format optimized for spatial metabolomics through two innovations: (1) a dynamic combinatorial compression algorithm for integer-based encoding of m/z and intensity data; (2) a coordinate-separation storage strategy for rapid spatial indexing. Experimental validation on 47 public datasets demonstrated significant performance gains. Compared to imzML, Aird reduced the storage footprint by about 70% (mean compression ratio: 30.03%) while maintaining near-lossless data precision (F1-score = 99.26% at 0.1 ppm m/z tolerance). For high-precision-controlled datasets, Aird accelerated loading 15-fold in MZmine.
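To make the idea of integer-based encoding concrete, the following is a minimal sketch and not the Aird algorithm itself. It assumes a fixed scaling factor (`precision`, a hypothetical parameter), simple delta encoding of sorted m/z values, and zlib as the final compressor; the abstract does not specify which encoders Aird combines or how it selects among them. The helper `ppm_error` shows how a round-trip error can be compared against a tolerance such as 0.1 ppm.

```python
import struct
import zlib


def encode_mz(mz_values, precision=100_000):
    """Scale sorted m/z floats to integers, delta-encode, then zlib-compress.

    `precision` is an assumed scaling factor: 100_000 keeps five decimal places.
    """
    ints = [round(mz * precision) for mz in mz_values]
    # Delta encoding: store each value as the difference from its predecessor.
    deltas = [b - a for a, b in zip([0] + ints, ints)]
    raw = struct.pack(f"<{len(deltas)}i", *deltas)  # 32-bit little-endian ints
    return zlib.compress(raw)


def decode_mz(blob, precision=100_000):
    """Invert encode_mz: decompress, undo the deltas, rescale to floats."""
    raw = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(raw) // 4}i", raw)
    values, running = [], 0
    for d in deltas:
        running += d
        values.append(running / precision)
    return values


def ppm_error(observed, reference):
    """Relative deviation between two m/z values in parts per million."""
    return abs(observed - reference) / reference * 1e6


# Illustrative values only; these are not taken from the study's datasets.
spectrum = [100.00012, 100.00187, 255.23301, 885.54962, 1200.123456]
restored = decode_mz(encode_mz(spectrum))
worst = max(ppm_error(r, s) for r, s in zip(restored, spectrum))
print(f"worst round-trip error: {worst:.4f} ppm")
```

With this scaling factor the rounding error is at most 0.000005 in m/z, which stays below 0.1 ppm for any m/z above 50. Rounding to integers is what makes such a scheme near-lossless rather than strictly lossless.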
The Aird format overcomes crucial bottlenecks in spatial metabolomics by harmonizing storage efficiency, computational speed, and analytical precision, reducing I/O latency for large cohorts. By achieving near-native feature detection accuracy, Aird establishes a robust infrastructure for translational applications, including disease biomarker discovery and pharmacokinetic imaging.