Prediction of TB treatment outcomes among HIV/TB coinfected patients in Uganda using routinely collected clinical data: a machine-learning approach
Abstract
Background Tuberculosis (TB) remains a major public health challenge, particularly in settings with high HIV prevalence. In Uganda, TB/HIV co-infection contributes to suboptimal treatment success rates (TSR), which remain below the World Health Organisation (WHO) target of ≥ 90%. Early identification of patients at risk of poor outcomes is essential to address poor adherence, reduce the risk of multidrug-resistant TB (MDR-TB), and reduce morbidity, mortality, and transmission. This study applied machine learning (ML) techniques to predict TB treatment outcomes among HIV/TB co-infected patients in Uganda and to identify the most influential predictors of treatment failure using feature importance and partial dependence analyses.
Methods Data from a retrospective cohort of 5,062 HIV/TB co-infected patients treated in Uganda between 2020 and 2024 were analysed. Machine learning models, including logistic regression, Random Forest, Gradient Boosting Machine (GBM), Support Vector Machine (SVM), AdaBoost, and a stacked ensemble, were developed and evaluated using discrimination, calibration, recall, and accuracy metrics. Feature importance and partial dependence analyses were used to interpret model predictions.
Results The stacked ensemble and class-balanced logistic regression models achieved the best overall performance (AUC ≈ 0.67; accuracy ≈ 0.62). The Random Forest model exhibited the highest discrimination (ROC-AUC = 0.675), while GBM achieved the highest accuracy (0.768) but had low sensitivity to treatment failures. Key predictors of treatment success included age, ART status, sex, marital status, TB classification, and treatment model. Treatment success declined progressively with increasing age, particularly beyond 40 years.
Conclusions The models demonstrated moderate predictive performance and identified key demographic and programmatic predictors of TB treatment outcomes. While not suitable for autonomous clinical decision-making, these models may support risk stratification and targeted patient follow-up.
Trial registration number: Not Applicable
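The Results report that the model with the highest accuracy (GBM) was also insensitive to treatment failures. The sketch below illustrates how that can happen when failures are the minority class, and why accuracy, recall, and ROC-AUC are best read together. It is a minimal illustration only: the ten-patient cohort, the risk scores, the thresholds, and the helper names (`accuracy`, `recall`, `roc_auc`, `classify`) are hypothetical and are not taken from the study's data or code.

```python
# Hypothetical toy cohort (NOT study data): 1 = treatment failure, 0 = success.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# Hypothetical predicted failure risks from some classifier.
scores = [0.10, 0.20, 0.15, 0.30, 0.45, 0.25, 0.35, 0.05, 0.40, 0.30]


def accuracy(y_true, y_pred):
    """Fraction of patients whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Sensitivity: fraction of true failures that were flagged as failures."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores, positive=1):
    """ROC-AUC as the probability that a random failure outranks a random
    success (ties count as half), i.e. the Mann-Whitney formulation."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classify(scores, threshold):
    """Flag a patient as a predicted failure when risk >= threshold."""
    return [1 if s >= threshold else 0 for s in scores]


default_pred = classify(scores, 0.5)   # no score reaches 0.5: everyone "success"
lowered_pred = classify(scores, 0.3)   # lower cut-off flags more patients

print("threshold 0.5:", accuracy(y_true, default_pred), recall(y_true, default_pred))
print("threshold 0.3:", accuracy(y_true, lowered_pred), recall(y_true, lowered_pred))
print("ROC-AUC:", roc_auc(y_true, scores))
```

In this toy setting, the default threshold gives the higher accuracy while missing every failure, whereas the lower threshold trades some accuracy for sensitivity. ROC-AUC is unchanged by the threshold because it depends only on how the scores rank patients.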