MARISMa: a routine MALDI-TOF MS database from 2018 to 2024
Abstract
Clinical microbiology laboratories play a crucial role in identifying pathogens, guiding antibiotic treatment, and managing antimicrobial resistance (AMR). Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) has become essential for rapid, accurate, and cost-effective microbial identification. Recent advances in integrating MALDI-TOF MS with Artificial Intelligence (AI) show promise in improving microbial detection and prediction of AMR. However, progress is limited by the lack of comprehensive, openly accessible databases, which restricts the validation, reproducibility, and applicability of models.
To address this gap, we introduce a publicly available, extensively documented MALDI-TOF MS database comprising 207,950 unique spectra from isolates collected between 2018 and 2024 at the Hospital General Universitario Gregorio Marañón, Spain. This dataset includes 191,048 bacterial, 16,537 fungal, and 365 mycobacterial isolates, all provided in raw format with detailed metadata and AMR annotations for 13,114 bacterial isolates. In contrast to many existing datasets, this resource is openly and freely shared, rigorously curated, and designed to support a wide range of machine learning and epidemiological applications. By ensuring unrestricted access to high-quality, standardized data, this database aims to promote transparency, reproducibility, comparative benchmarking, and collaborative progress in AI-driven clinical microbiology.