Predicting factors associated with under-5 mortality in India using machine learning algorithms: evidence from National Family Health Survey, 2019-21


Abstract

Background: Reducing the under-5 mortality rate is a high priority on the global development agenda. The Sustainable Development Goals (SDGs) target a reduction in the child mortality rate to less than 25 deaths per 1,000 live births by 2030. Although enormous gains in under-five mortality and child health have been observed over the past few years, under-five mortality remains a significant public health challenge for India. Many earlier studies have examined under-five mortality rates. Apart from this study, the approach most widely used so far relies largely on conventional regression analyses, which are known to have limited predictive ability. This study attempts to develop a predictive model, based on advanced machine learning (AML) techniques, that can predict India's under-5 mortality rate (U5MR) more accurately. Methods: This study used nationally representative microdata from the fifth round of the National Family Health Survey (NFHS-5, 2019-21). Several imputation methods, such as filling with the modal value, were adopted to treat missing values. We used a feature selection method known as information gain to rank the most informative features and examine their impact on the prediction of child mortality. The synthetic minority over-sampling technique (SMOTE) was used to balance the dataset. To predict the determinants affecting U5MR, we used four machine learning (ML) models: decision trees, logistic regression, support vector machines, and K-nearest neighbors. The predictive power of each ML model was assessed using accuracy, precision, F1 score, and receiver operating characteristic (ROC) curves. Results: The descriptive findings demonstrate that India's under-five mortality rates vary significantly by region. 
The Decision Tree model performed best of all the models examined (96.35% accuracy), with under-five mortality prediction ability across the models ranging from 90% to 96.35%. The best predictive model indicates that the factors influencing under-five mortality rates in India include duration of breastfeeding, interval from marriage to first birth, education, birth order, antenatal care (ANC) visits, wealth index, place of delivery, and place of residence. For estimating under-five mortality risk variables, the Decision Tree model offers greater predictive capability and would therefore support better policy decision-making in this field. Appropriate policies directed at these major factors would greatly enhance a child's chances of survival. Conclusion: The decision tree model performed better than traditional logistic regression. Prediction of the under-5 mortality rate at the India level achieved an accuracy of 96.35 percent and a precision of about 60 percent.
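
The information gain ranking named in the Methods can be illustrated with a minimal Python sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the function names (`entropy`, `information_gain`, `rank_features`), the feature names, and the eight toy records, which are invented and are not NFHS-5 values. Information gain is taken in its usual sense: the entropy of the outcome minus the weighted entropy of the outcome within each category of a feature.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(columns, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented purely for illustration (NOT NFHS-5 values).
# died = 1 means the child died before age five.
died = [1, 1, 1, 1, 0, 0, 0, 0]
toy_columns = {
    "breastfed_6m_plus": ["no", "no", "no", "no", "yes", "yes", "yes", "yes"],
    "residence": ["rural", "urban", "rural", "urban", "rural", "urban", "rural", "urban"],
    "wealth": ["poor", "poor", "poor", "rich", "poor", "rich", "rich", "rich"],
}

ranking = rank_features(toy_columns, died)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

In this toy example the hypothetical breastfeeding column separates the outcome perfectly and so ranks first, while residence carries no information about the outcome. On real survey data, continuous variables would need to be binned first, and a library routine such as scikit-learn's `mutual_info_classif` may be a more practical choice.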
