Machine learning models for predicting childhood anemia in Mozambique: analysis from national survey data
Abstract
Background Childhood anemia remains a major public health concern in sub-Saharan Africa, with Mozambique among the most affected countries. Despite the growing use of machine learning (ML) to enhance disease prediction, there is a lack of national-level evidence on its application to childhood anemia in low-resource settings. This study aimed to develop, compare, and interpret ML models to predict anemia among children under five years of age in Mozambique using nationally representative survey data.

Methods Data were extracted from the 2022–2023 Mozambique Demographic and Health Survey (MDHS). Children under five were included, with anemia defined as hemoglobin < 11.0 g/dL. Five ML models were developed and validated: Logistic Regression, Random Forest, XGBoost, LightGBM, and CatBoost. The predictive performance of each model was assessed using AUC-ROC and the calibration slope, and SHAP analysis was used for interpretability.

Results Among the 1,638 children analyzed, the prevalence of anemia was high (40.3%). XGBoost demonstrated the best discrimination (AUC-ROC = 0.722), with a sensitivity of 57.6% and a specificity of 93.2%. The SHAP analysis identified child's age in months, lack of vitamin A supplementation, low household wealth, maternal education, number of children, and recent diarrhea as the strongest predictors.

Conclusion In general, the ensemble boosting models showed the highest discriminatory capacity, with XGBoost performing best. These findings highlight the potential of interpretable, low-cost predictive models to support early screening and targeted interventions for childhood anemia. Future work should explore regional retraining and recalibration, fairness evaluation, transfer learning, and external validation to enhance generalizability and field applicability.
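The outcome definition and discrimination metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the hemoglobin values, the predicted probabilities, the 0.5 decision threshold, and all function names are hypothetical and chosen only for illustration. The only value taken from the abstract is the 11.0 g/dL anemia cutoff.

```python
from typing import List, Sequence, Tuple

ANEMIA_THRESHOLD_G_DL = 11.0  # cutoff stated in the abstract


def is_anemic(hemoglobin_g_dl: float) -> bool:
    """Label a child as anemic when hemoglobin is below 11.0 g/dL."""
    return hemoglobin_g_dl < ANEMIA_THRESHOLD_G_DL


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random anemic child receives a
    higher score than a random non-anemic child (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the MDHS.
hemoglobin: List[float] = [9.8, 10.9, 11.0, 12.4, 10.2, 13.1]
y_true: List[int] = [int(is_anemic(h)) for h in hemoglobin]

# Hypothetical predicted probabilities from some fitted classifier.
scores: List[float] = [0.81, 0.42, 0.35, 0.10, 0.66, 0.48]

# Assumed decision threshold of 0.5; the abstract does not state one.
y_pred: List[int] = [int(s >= 0.5) for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC-ROC={auc:.3f}")
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUC-ROC summarizes ranking quality across all thresholds. This is why a model can combine moderate sensitivity with high specificity at one operating point.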