Risk-Based Prediction of Novel AMR Variants Using Protein Language Models
Abstract
Antimicrobial resistance (AMR) is among the most pressing global health threats of the 21st century, with the potential to thrust modern medicine back into a pre-antibiotic era. Resistance can arise through diverse mechanisms, including genomic mutations that prevent antibiotics from reaching or acting on their targets. To limit the spread of AMR, surveillance systems must detect both known and emerging resistance markers. Here we present AMRscope, a model trained on ESM2 protein language model embeddings of single mutations to predict resistance likelihood, combined with a rigorous evaluation framework. The tool is applied across antibiotic-interacting proteins of different bacterial species, including WHO priority pathogens such as rifampicin-resistant M. tuberculosis and carbapenem-resistant P. aeruginosa. On random splits, the model achieves competitive accuracy, F1 and MCC of 0.88, 0.87 and 0.75, respectively, while additional splitting strategies demonstrate transfer of predictive power to unseen organisms or genes. Moreover, in silico deep mutational scanning and structural mapping across these targets reveal that the tool can recover known resistance-associated regions and highlight new candidates. The risk-based outputs complement database matching and resistance element detection tools, providing clinicians and public health agencies with an interpretable and scalable system for AMR surveillance and proactive response.
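As an illustration of the in silico deep mutational scanning step mentioned above, the following minimal sketch enumerates every single amino-acid substitution of a protein sequence. It is not AMRscope's implementation: the function name `single_substitutions` and the toy sequence are hypothetical, and the embedding and resistance-scoring of each variant are left out.

```python
# Standard 20 amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence):
    """Yield (label, mutant_sequence) for every single amino-acid substitution.

    Hypothetical helper for illustration only; labels use the common
    <wild type><1-based position><mutant> convention, e.g. S2L.
    """
    for i, wild_type in enumerate(sequence):
        for alt in AMINO_ACIDS:
            if alt == wild_type:
                continue  # skip the unchanged residue
            label = f"{wild_type}{i + 1}{alt}"
            yield label, sequence[:i] + alt + sequence[i + 1:]


# Toy input, not a real protein (assumption for the example).
toy_sequence = "MSTL"
variants = dict(single_substitutions(toy_sequence))
```

In a full scan, each mutant sequence would then be passed to a trained model to obtain a per-variant resistance likelihood, which could be aggregated per position and mapped onto a structure.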