Predicting Under Five Mortality in Bihar Through Machine Learning and SDG Metrics

Muskaan Gupta
Sacheendra Shukla
Niraj Kumar Singh


Abstract

Child mortality is a vital indicator of a nation’s health and development, closely aligned with the Sustainable Development Goals (SDGs). This study investigates the determinants of under-five mortality in Bihar, India, utilizing data from the National Family Health Survey (NFHS-5, 2019–21). A total of 21,040 records of children born to married women were analyzed using 33 predictor variables selected based on their relevance to SDG targets. The research employs a comparative machine learning approach, evaluating the predictive performance of Logistic Regression, Random Forest, K-Nearest Neighbors (KNN), Naïve Bayes, and Support Vector Machine (SVM) models. The results reveal that Random Forest and Naïve Bayes models achieved the highest accuracy (98.80% and 98.67%, respectively), with Naïve Bayes attaining perfect recall (100%) and an F1 score of 99.53%, while Random Forest achieved an F1 score of 98.73%. Logistic Regression showed moderate performance with 76.61% accuracy, 74.92% precision, and an F1 score of 76.31%. K-Nearest Neighbors (KNN) achieved 84.12% accuracy and 88.56% precision, but had a lower recall of 75.24%. The Support Vector Machine (SVM) model performed well with 86.38% accuracy and a balanced F1 score of 86.46%. AUC-ROC scores ranged from 85.64% (Logistic Regression) to 99.96% (Random Forest), indicating strong model discrimination across the board. These findings underscore the potential of machine learning in identifying key socio-demographic, economic, and health-related factors influencing child survival. The study provides valuable insights for policymakers aiming to reduce child mortality and achieve SDG targets in Bihar.
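The abstract reports precision, recall, and F1 for some models but not all three for every model. As a rough illustration of how these metrics relate, the sketch below applies the standard definition F1 = 2PR / (P + R) to the quoted figures. It assumes the reported values are plain binary-class metrics (not macro- or weighted averages), which the abstract does not state; the function names are hypothetical, and the derived numbers are implied by the formula rather than reported by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from_f1(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given P and F1."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("F1 must be smaller than 2 * precision")
    return f1 * precision / denominator


# Figures quoted in the abstract, converted from percentages to fractions.
knn_precision, knn_recall = 0.8856, 0.7524
logreg_precision, logreg_f1 = 0.7492, 0.7631

# Derived values: implied by the F1 definition, NOT reported in the abstract.
knn_f1 = f1_score(knn_precision, knn_recall)
logreg_recall = recall_from_f1(logreg_precision, logreg_f1)

print(f"KNN F1 implied by reported precision/recall: {knn_f1:.4f}")
print(f"Logistic Regression recall implied by precision/F1: {logreg_recall:.4f}")
```

If the full article reports values that differ noticeably from these implied figures, the most likely explanation is that the published metrics use a different averaging scheme than the one assumed here.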

Version published to 10.21203/rs.3.rs-7176124/v1 on Research Square
Aug 7, 2025

Comparing Algorithm Effectiveness in Health Data Analysis

This article has 1 author:
1. Abdulmalik Hazaa Alshammari
This article has no evaluations. Latest version: Jan 22, 2026

The power of machine learning models in predicting gestational diabetes mellitus

This article has 7 authors:
1. Vahid Mehrnoush
2. Ali Haghighat
3. Anna Nami
4. Nazanin Rezaei
5. Fatemeh Darsareh
6. Farideh Montazeri
7. Mozhgan Saffari
This article has no evaluations. Latest version: Dec 17, 2025

A Machine Learning Approach for Identifying and Predicting Risk Factors Related to Low Birth Weight in Newborn Children in Bangladesh

This article has 7 authors:
1. Samrat Kumar Dev Sharma
2. Md. Yusuf Hossain Ador
3. Md. Rukonuzzaman
4. Futanta Chakma
5. Mahmud Hossen
6. Jakir Hossain
7. Md. Kamruzzaman
This article has no evaluations. Latest version: Dec 15, 2025
