Genomic and Machine Learning Approaches for Predicting Antimicrobial Resistance: A One Health Scoping Review in Low- and Middle-Income Countries

Abstract

Background: Antimicrobial resistance (AMR) is a major global health threat affecting humans, animals, and the environment. Its burden is particularly severe in low- and middle-income countries (LMICs), where infectious diseases are prevalent, antimicrobial use is often unregulated, and laboratory capacity is limited. Conventional phenotypic testing is essential but time-consuming and may fail to detect emerging or complex resistance mechanisms.

Objective: This scoping review maps genomic surveillance approaches combined with machine learning (ML) and artificial intelligence (AI) for predicting bacterial AMR, emphasizing methodological strategies, predictive performance, and relevance to LMIC One Health surveillance.

Methods: Systematic searches of PubMed, Scopus, Web of Science, EMBASE, and preprint servers (2020–2026) identified studies applying whole-genome sequencing (WGS) or pan-genomic approaches with ML/AI to predict AMR phenotypes. Data were extracted on pathogens, genomic feature engineering, ML/AI models, validation strategies, performance metrics, sample sources (human, animal, environmental), and LMIC relevance.

Results: Twenty-seven studies met inclusion criteria, including 22 with direct or transferable LMIC relevance. Predictive performance ranged from 78% to 98%, with LMIC datasets achieving 80%–94% accuracy. Tree-based ensembles (Random Forest, gradient boosting), logistic regression, and neural networks predominated. Genomic features included single nucleotide polymorphisms, k-mer encodings, and pan-genome presence–absence matrices. Explainable AI methods, such as SHAP, improved interpretability. Most models were trained on high-income country datasets, and integrated LMIC datasets spanning human, animal, and environmental reservoirs remain limited.

Conclusions: Genomic ML/AI approaches offer a rapid, high-resolution pathway for AMR prediction and One Health surveillance. Expanding LMIC-specific datasets, improving external validation, and integrating explainable AI are critical for equitable and sustainable deployment.
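
To make the genomic feature types named in the Results concrete, here is a minimal Python sketch of two of them: a k-mer count encoding and a gene presence–absence matrix. It is an illustration under stated assumptions, not the pipeline of any reviewed study. The function names, the choice of k, and the isolate and gene labels are all hypothetical, and real workflows typically operate on assembled genomes with much larger k.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=3):
    """Count overlapping k-mers, skipping any that contain non-ACGT characters."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_vector(sequence, k=3):
    """Fixed-length feature vector: one count per possible k-mer, in lexicographic order."""
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = kmer_counts(sequence, k)
    return [counts[kmer] for kmer in vocab]


def presence_absence(isolate_genes, gene_order):
    """Map each isolate to a 0/1 row indicating which genes in gene_order it carries."""
    return {
        isolate: [1 if gene in genes else 0 for gene in gene_order]
        for isolate, genes in isolate_genes.items()
    }


# Hypothetical toy inputs, for illustration only.
toy_vector = kmer_vector("ACGTAC", k=2)
toy_matrix = presence_absence(
    {"isolate_1": {"geneA"}, "isolate_2": {"geneA", "geneB"}},
    ["geneA", "geneB"],
)
```

Vectors or rows like these could then be passed, together with phenotypic susceptibility labels, to a classifier of the kind the review reports, such as a tree-based ensemble.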
