Early detection of ampicillin susceptibility in Enterococcus faecium with MALDI-TOF MS and machine learning
Abstract
Background
Enterococcus faecium can cause severe infections and is often resistant to the first-line antibiotic ampicillin. Consequently, clinicians usually prescribe broad-spectrum antibiotics, promoting the selection of multidrug-resistant bacteria. In this study, we investigate the application of machine learning techniques to detect ampicillin susceptibility directly from MALDI-TOF mass spectra. This approach could enable earlier optimised treatment of infections with ampicillin-susceptible E. faecium.
Methods
Two datasets of clinical E. faecium MALDI-TOF spectra and their resistance phenotypes were analysed: our own Technical University of Munich (TUM) dataset and the publicly available MS-UMG dataset. We tested logistic regression and LightGBM models on each dataset via nested cross-validation and explored transferability by evaluating models trained on one dataset against the other.
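The nested cross-validation described above can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: the synthetic feature matrix stands in for binned spectra, the hyperparameter grid and fold counts are assumptions, the label 1 is assumed to mean "susceptible", and only logistic regression is shown (a LightGBM classifier could be substituted for the estimator). scikit-learn's `average_precision` scorer is used as an estimate of the AUPRC.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for binned spectra (rows: isolates, columns: m/z bins).
# Assumption: label 1 marks the susceptible (minority) class.
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=8,
    weights=[0.7, 0.3], random_state=0,
)

# Inner folds select hyperparameters; outer folds estimate performance.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}  # illustrative grid

# The grid search is refit inside every outer training fold, so the outer
# test fold never influences hyperparameter selection.
search = GridSearchCV(pipeline, param_grid, scoring="average_precision", cv=inner_cv)
outer_scores = cross_val_score(search, X, y, scoring="average_precision", cv=outer_cv)

print(f"AUPRC {outer_scores.mean():.3f} ± {outer_scores.std():.3f}")
```

Reporting the mean and standard deviation over the outer folds mirrors the "value ± spread" format used for the results, although the study's exact aggregation may differ.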
Results
LightGBM demonstrated slightly better performance than logistic regression in identifying susceptible isolates in the TUM dataset (area under the precision-recall curve (AUPRC) 0.907 ± 0.016 vs 0.902 ± 0.030) as well as in the MS-UMG dataset (AUPRC 0.902 ± 0.029 vs 0.899 ± 0.054). External validation demonstrated good model transferability (AUPRC of 0.784 ± 0.039 when trained on MS-UMG; 0.804 ± 0.013 when trained on TUM). SHAP analysis consistently identified a top-ranked spectral feature corresponding to a peak at an m/z of 5091.66 in resistant isolate spectra.
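The external validation reported above (train on one dataset, evaluate on the other) might look like the following sketch. Everything here is hypothetical: the two "sites" are halves of one synthetic dataset with added noise to mimic an instrument shift, and `transfer_auprc` is a helper introduced only for illustration. Average precision again serves as the AUPRC estimate.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Two synthetic "sites" standing in for the TUM and MS-UMG datasets.
X, y = make_classification(n_samples=600, n_features=50, n_informative=8, random_state=0)
rng = np.random.default_rng(0)
X_a, y_a = X[:300], y[:300]
# Assumption: site B differs from site A by additive measurement noise.
X_b, y_b = X[300:] + rng.normal(0.0, 0.3, size=(300, 50)), y[300:]


def transfer_auprc(X_train, y_train, X_test, y_test):
    """Fit on one dataset and return the average precision on the other."""
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]  # probability of class 1
    return average_precision_score(y_test, scores)


a_to_b = transfer_auprc(X_a, y_a, X_b, y_b)
b_to_a = transfer_auprc(X_b, y_b, X_a, y_a)
print(f"trained on A, tested on B: {a_to_b:.3f}")
print(f"trained on B, tested on A: {b_to_a:.3f}")
```

Fitting the scaler inside the pipeline keeps all preprocessing statistics tied to the training dataset, which matters when the test data come from a different instrument or laboratory.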
Conclusion
This study demonstrates that logistic regression and LightGBM models can identify ampicillin-susceptible E. faecium isolates from MALDI-TOF spectra and generalise well to unseen datasets. While clinical implementation would currently still require confirmatory testing, the addition of larger datasets in the near future will support the development of more robust machine learning models.