Validation and Improvement of Child Undernutrition Risk Prediction Model Using National Family Health Survey - 5 Data from India
Abstract
More than a third of India’s children suffer from undernutrition, and despite nationwide efforts, India is projected to miss the Sustainable Development Goal targets for 2030. Worse, socioeconomic disparities in these metrics have widened. A data-driven prioritization effort may hold promise for more efficient and effective interventions. Data from the nationwide National Family Health Survey-5 (NFHS-5) were used to evaluate the performance of a predictive model previously developed by our team using NFHS-4 data. Model performance was assessed for discrimination (AUC) and calibration. Enhancements were explored using LASSO feature selection to refine the predictors. A new model (Enhanced Model) was trained on NFHS-5 data using logistic regression, XGBoost, and neural networks. In the analytical sample, 51% of children were undernourished as per the Composite Index of Anthropometric Failure (CIAF) definition. The previously developed model (Model A), trained on NFHS-4 data, showed moderate discrimination (AUC: 66.6%), high sensitivity (93.76%), and low specificity (21.58%), leading to many false positives and risk overestimation. The Enhanced Model achieved a slightly higher AUC (67.5%), with an accuracy of 62.97%, sensitivity of 66.31%, and specificity of 58.82%. Compared to the previously developed logistic regression model, the updated predictive model trained on NFHS-5 data achieved modest improvements in accuracy and a better balance between sensitivity and specificity. Although machine learning methods are promising, their improvements over frequentist models remain incremental for prognosis based on single-timepoint predictors.
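To make the link between low specificity and "many false positives" concrete, the sketch below converts a prevalence, sensitivity, and specificity into expected confusion-matrix counts per 1,000 children. It is an illustration only: the function names are hypothetical, and it assumes the 51% prevalence of the analytical sample also applies to the data on which sensitivity and specificity were measured, which the abstract does not state.

```python
def expected_counts(prevalence, sensitivity, specificity, n=1000):
    """Expected confusion-matrix counts for n children.

    Assumption (not from the abstract): `prevalence` is the share of
    undernourished children in the evaluation data.
    """
    positives = prevalence * n        # truly undernourished
    negatives = n - positives         # truly not undernourished
    tp = sensitivity * positives      # correctly flagged
    fn = positives - tp               # missed cases
    tn = specificity * negatives      # correctly cleared
    fp = negatives - tn               # flagged but not undernourished
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def positive_predictive_value(counts):
    """Share of flagged children who are truly undernourished."""
    return counts["tp"] / (counts["tp"] + counts["fp"])


# Sensitivity and specificity as reported in the abstract; prevalence is assumed.
model_a = expected_counts(0.51, 0.9376, 0.2158)
enhanced = expected_counts(0.51, 0.6631, 0.5882)

for name, counts in [("Model A", model_a), ("Enhanced Model", enhanced)]:
    print(name,
          "false positives per 1,000:", round(counts["fp"]),
          "PPV:", round(positive_predictive_value(counts), 3))
```

Under this assumption, Model A would flag roughly 384 of every 1,000 children incorrectly, against roughly 202 for the Enhanced Model. These figures are derived for illustration and are not results reported by the authors.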