Predicting abnormal birth weight and identifying its determinants using machine learning in the Hararghe Health and Demographic Surveillance System, Ethiopia

Sington Abdeta
Ousmane Diop
Steve Cygu
Aboubacry Drame
Reinpeter Momanyi
Belayneh Endalamaw
Mulugeta Tadele
Yordanos Sintayehu
Miranda Barasa
Mirkuzie Woldie
Tsinuel Girma
Rosa Tsegaye
Agnes Kiragga
Merga Deresa
Bethlehem Adnew
Rawleigh Howe
Alemseged Abdisa

Abstract

Background
Birth weight is a reliable measure of intrauterine growth and an important predictor of a newborn’s survival, growth, and development. Globally, millions of babies are born with abnormal birth weight (15.5% of live births with low birth weight and 10% with macrosomia), the majority of them in sub-Saharan Africa. Ethiopia is no exception. This study applied machine learning methods to data from the Hararghe Health and Demographic Surveillance System (HDSS), which yielded strong predictive performance in capturing the complex, non-linear relationships among factors affecting newborn birth weight.

Methods
The study developed predictive models for abnormal birth weight using retrospective cross-sectional data from the Hararghe HDSS, collected from 2015 to 2022. All singleton births were included; those with missing birth weight data were excluded. Six machine learning models found to be effective in previous studies were built and compared to identify the best-performing model for abnormal birth weight prediction. Candidate features for all models were selected based on prior observational studies and expert opinion. The synthetic minority oversampling technique (SMOTE) was used to address class imbalance in the dataset. The dataset was divided into training (80%) and testing (20%) subsets to ensure independent model evaluation. Hyperparameter tuning was performed using grid search combined with 10-fold cross-validation to optimize model performance and reduce overfitting. The area under the receiver operating characteristic curve (AUROC), accuracy, precision, recall, F1-score, and kappa were determined. Feature importance was analyzed using SHapley Additive exPlanations (SHAP) values.

Results
Descriptive analysis of 11,553 singleton births showed that 10.78% of newborns had low birth weight (LBW) and 9.28% had high birth weight (HBW). The eXtreme Gradient Boosting (XGBoost) model performed best, achieving an AUROC of 0.835, an accuracy of 0.72, a precision of 0.67, an F1-score of 0.63, a recall of 0.54, and a kappa of 0.52 for abnormal birth weight prediction. The feature importance analysis showed that the top predictors of LBW included maternal educational status, age at first delivery, and antenatal care (ANC) visits, while HBW was most strongly predicted by ANC visits, maternal literacy status, age at first delivery, and maternal education.

Conclusion
Although machine learning methods for predicting abnormal birth weight yielded promising results that could have a significant public health impact, more research with comprehensive predictors, which are missing from the HDSS, is needed to draw firmer conclusions.
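To make the reported evaluation metrics concrete, the following is a minimal sketch of how accuracy, precision, recall, F1-score, and Cohen's kappa can be computed for a three-class outcome (LBW, normal, HBW). It is not the authors' code. The function name `evaluate`, the label `NBW` for normal birth weight, and the toy labels are hypothetical, and macro averaging across classes is an assumption, since the abstract does not state how the multi-class metrics were averaged. AUROC is left out because it requires predicted probabilities rather than class labels.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Accuracy, macro precision/recall/F1, and Cohen's kappa from class labels."""
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    pairs = Counter(zip(y_true, y_pred))   # (true, predicted) -> count
    true_totals = Counter(y_true)          # row sums of the confusion matrix
    pred_totals = Counter(y_pred)          # column sums of the confusion matrix

    # Observed agreement: share of cases on the confusion-matrix diagonal.
    accuracy = sum(pairs[(c, c)] for c in labels) / n

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        p = tp / pred_totals[c] if pred_totals[c] else 0.0
        r = tp / true_totals[c] if true_totals[c] else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    # Agreement expected by chance, from the marginal label frequencies.
    expected = sum(true_totals[c] * pred_totals[c] for c in labels) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    k = len(labels)
    return {
        "accuracy": accuracy,
        "macro_precision": sum(precisions) / k,
        "macro_recall": sum(recalls) / k,
        "macro_f1": sum(f1s) / k,
        "kappa": kappa,
    }


# Toy labels for illustration only; these are NOT data from the study.
y_true = ["LBW"] * 2 + ["NBW"] * 6 + ["HBW"] * 2
y_pred = ["LBW", "NBW"] + ["NBW"] * 5 + ["HBW"] + ["HBW", "NBW"]

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Kappa corrects accuracy for agreement expected by chance, which matters here because most births fall in the normal-weight class: a model that always predicted the majority class would score a high accuracy but a kappa of zero.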

Version published to 10.21203/rs.3.rs-9226614/v1 on Research Square
Mar 31, 2026

Related articles

Perinatal Mortality Prediction and Risk Factor Identification Using Machine Learning on Recent Sub-Saharan African DHS Data Affiliations

This article has 8 authors:
1. Tadele Chekol Maru
2. Andualem Enyew
3. Makda Fekadie Tewelgne
4. Eliyas Addisu Taye
5. Agerie Mengistie Zeleke
6. Belayneh Jejaw Abate
7. Deresse Abebe Gebrehana
8. Azanaw Amare Muche
This article has no evaluations
Latest version Mar 30, 2026

Predicting Adequate Antenatal Care Utilization Among Pregnant Women in Kenya: A Comparative Machine Learning Study Using the Kenya Demographic and Health Survey

This article has 1 author:
1. Calvince Otieno Ngaji
This article has no evaluations
Latest version Mar 27, 2026

Association between hemoglobin trajectories during pregnancy and birth outcomes: a retrospective cohort study based on latent growth mixture modeling analysis

This article has 6 authors:
1. Xiaohong Yu
2. Hongli Nie
3. Wenzhen Yang
4. Jing Tang
5. Xuexue Huang
6. Yongmei Zhao
This article has no evaluations
Latest version Apr 8, 2026
