A Study on the Identification of Risk Factors for Latent Tuberculosis Infection in Xinjiang Using Machine Learning
Abstract
Background: Latent tuberculosis infection (LTBI) is a significant reservoir for the development of active tuberculosis (TB). Identifying key risk factors for LTBI is crucial for effective prevention and control strategies. Machine learning (ML) techniques can uncover complex relationships between risk factors and disease outcomes.

Methods: Data were collected from the "Tuberculosis Management Information System" in China. LTBI was defined as a positive tuberculin skin test. Four ML models (random forest, XGBoost, support vector machine, and neural network) were used for feature importance analysis, alongside LASSO and logistic regression, to identify key risk factors. A risk nomogram was constructed from the selected variables.

Results: Key risk factors identified included age, body mass index (BMI), smoking status, occupational dust exposure, diabetes, and family history of TB. Logistic regression also highlighted medical insurance type, immunosuppressant use, education level, silicosis, anemia, mental health status, TB contact history, and insomnia. The risk nomogram showed good discrimination (AUC = 0.839).

Conclusion: This study identified several key risk factors for LTBI in a Chinese population using ML techniques. The risk nomogram can aid targeted LTBI screening and prevention, emphasizing interventions such as smoking cessation and occupational dust control to reduce the burden of LTBI and active TB.
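
The discrimination figure reported above (AUC = 0.839) can be read as the probability that a randomly chosen LTBI-positive participant receives a higher predicted risk than a randomly chosen LTBI-negative one. The following is a minimal sketch of that calculation, not the study's code: the function name `auc` and the toy labels and scores are illustrative assumptions and are not taken from the study's data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study):
# label 1 = tuberculin skin test positive, 0 = negative;
# scores stand in for a model's predicted risk.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1, 0.7]

toy_auc = auc(labels, scores)
print(f"toy AUC = {toy_auc:.4f}")
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, so a value of 0.839 means the nomogram ranks a positive participant above a negative one in roughly 84% of such pairs.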