Decoding Helicobacter pylori Resistance: Machine Learning–Enhanced Prediction of Antibiotic Susceptibility using Whole-Genome Sequencing
Abstract
Background
Helicobacter pylori is a significant risk factor for gastric cancer, peptic ulcers, and MALT lymphoma. Rising antibiotic resistance rates complicate treatment. Although nucleotide sequence-based assays reliably predict clarithromycin and levofloxacin resistance, predicting metronidazole resistance is more challenging because diverse metabolic pathways contribute to resistance and genomic variability is high.
Methods
We assembled a cohort of 483 H. pylori clinical isolates, combining whole-genome sequencing with phenotypic susceptibility testing. Machine learning models (SVM, XGBoost, FNN) were trained on genomic variants to predict resistance phenotypes. For feature selection, a sliding-window approach and SHAP-based importance scoring were used to identify biologically relevant mutations and improve prediction accuracy, particularly for metronidazole resistance.
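As a rough illustration of what a sliding-window grouping of variants can look like, the sketch below bins variant positions into overlapping genomic windows so that each window becomes one candidate feature set. The function name, the window and step sizes, and the example positions are all hypothetical; the abstract does not specify how the authors' windows were defined or how they were combined with SHAP scores.

```python
from typing import Dict, List, Tuple


def sliding_window_feature_sets(
    positions: List[int], window: int, step: int
) -> Dict[Tuple[int, int], List[int]]:
    """Group variant positions into overlapping genomic windows.

    Returns a mapping (start, end) -> sorted positions with start <= pos < end.
    Windows that contain no variant are skipped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if not positions:
        return {}

    ordered = sorted(positions)
    last = ordered[-1]
    feature_sets: Dict[Tuple[int, int], List[int]] = {}

    start = 0
    while start <= last:
        end = start + window
        members = [p for p in ordered if start <= p < end]
        if members:
            feature_sets[(start, end)] = members
        start += step
    return feature_sets


# Hypothetical variant positions (not taken from the study).
variant_positions = [120, 450, 980, 1020, 2600]
feature_sets = sliding_window_feature_sets(variant_positions, window=1000, step=500)

for (start, end), members in feature_sets.items():
    print(f"window [{start}, {end}): {members}")
```

In a pipeline of this kind, each window's variants would typically be passed to a model as a block, and an importance score such as SHAP could then be used to rank the windows; that step is omitted here because it depends on modelling details the abstract does not give.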
Results
The best-performing FNN model improved metronidazole resistance prediction by 16% compared to conventional (non-ML, single-polymorphism) sequence-based detection methods applied to the same strain collection. Feature selection identified 32 feature sets, 11 of which significantly improved F1-scores over the baseline. Combining 2–4 feature sets revealed 53 synergistic combinations across all models. Validation showed that 87% of these combinations significantly outperformed non-ML molecular testing, with 16 combinations achieving F1-scores above 0.65.
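For readers unfamiliar with the metric reported above, the F1-score is the harmonic mean of precision and recall for the positive class (here, presumably the resistant phenotype). The minimal sketch below computes it from scratch; the labels are invented for illustration and are not data from the study.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical phenotypes: 1 = resistant, 0 = susceptible.
observed = [1, 1, 1, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 1, 1, 0, 0]
score = f1_score(observed, predicted)
print(f"F1 = {score:.2f}")
```

Because F1 ignores true negatives, it is often preferred over plain accuracy when resistant and susceptible isolates are unevenly represented.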
Conclusion
Machine learning can significantly improve the performance of sequence-based susceptibility testing for metronidazole in H. pylori. Novel candidate predictive markers identified from whole-genome data offer testable hypotheses about as yet unexplored mechanisms of metronidazole resistance. These findings support the potential for ML-based approaches to enable more accurate susceptibility-guided therapies.