Scalable machine learning improves resistance prediction and identifies novel determinants in Mycobacterium tuberculosis
Abstract
Multidrug-resistant and extensively drug-resistant Mycobacterium tuberculosis (MTB) represents a growing global health crisis, characterized by limited treatment options and high mortality rates. Rapid and accurate prediction of resistance profiles is critical to guide effective therapy and curb transmission. Whole-genome sequencing (WGS) offers promise for individualized resistance profiling, yet existing computational tools remain constrained by predefined mutation catalogs and prohibitive resource requirements for large-scale analyses. Here, we present AURA, a GPU-accelerated, pangenome-scale machine learning framework for de novo resistance prediction. Trained on 12,185 globally diverse MTB isolates, AURA predicts resistance to 13 first-line, second-line, and repurposed antibiotics with high precision and identifies 59 novel resistance-associated loci, including variants in katG, pncA, rpoC, and members of the PE/PGRS gene family. By enabling model training on an unprecedented genomic scale, AURA provides new insights into the genetic architecture of resistance and establishes a scalable platform for precision-guided therapy and global surveillance of MTB.