Predicting birth asphyxia in newborns via supervised machine learning: A prospective cohort study in Tigray, Ethiopia 2025


Abstract

Background: Birth asphyxia, a critical condition characterized by insufficient oxygen supply to a newborn before, during, or after birth, is the second leading cause of neonatal mortality in Ethiopia. It contributes substantially to preventable neonatal morbidity and long-term neurodevelopmental impairment. The burden is especially high in low-resource regions such as Tigray, where healthcare systems have been severely impacted by conflict and limited infrastructure. Supervised machine learning (ML) offers a data-driven way to support clinical decision-making by identifying at-risk newborns early and accurately.

Methods: An institution-based prospective cohort study was conducted among 1014 mothers who delivered at four selected hospitals in Tigray (Ayder, Mekelle, Quiha, and Wukro) between February 25 and April 10, 2025, and their newborns. Eligible participants were recruited by systematic random sampling. Preprocessing included handling of missing values, one-hot encoding, normalization, a hybrid feature selection approach, and class balancing. Seven ML models were trained and evaluated: logistic regression, support vector machine, decision tree, random forest (RF), naive Bayes, k-nearest neighbors, and extreme gradient boosting. Model performance was assessed in terms of accuracy, sensitivity, specificity, F1-score, and area under the receiver operating characteristic curve (AUC), each with 95% confidence intervals. Shapley additive explanations (SHAP) were used for model interpretability, and the resulting attributions were validated across cross-validation folds.

Results: Of the 1014 neonates included, 195 (19.2%) were diagnosed with birth asphyxia on the basis of APGAR scores and physician confirmation. The RF classifier achieved the best performance, with an AUC of 0.99 (95% CI: 0.98–1.00) and a Brier score of 0.0099 (95% CI: 0.008–0.012).
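To illustrate the two headline metrics, the following minimal sketch computes a Brier score and an AUC with percentile-bootstrap 95% confidence intervals. It is not the authors' code: the abstract does not state how the intervals were obtained, so the bootstrap is an assumption, the helper names (`brier_score`, `auc`, `bootstrap_ci`) are hypothetical, and the data are synthetic toy values rather than study data.

```python
import random

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def auc(y_true, y_prob):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(metric, y_true, y_prob, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed method, not taken from the abstract)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:  # skip resamples with a single class; AUC is undefined
            continue
        ps = [y_prob[i] for i in idx]
        stats.append(metric(ys, ps))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic toy data (NOT study data): roughly 20% positives, well-separated scores.
rng = random.Random(42)
y_true = [1 if rng.random() < 0.2 else 0 for _ in range(200)]
y_prob = [min(1.0, max(0.0, 0.75 * y + 0.1 + rng.gauss(0, 0.1))) for y in y_true]

print("AUC:", auc(y_true, y_prob), bootstrap_ci(auc, y_true, y_prob))
print("Brier:", brier_score(y_true, y_prob), bootstrap_ci(brier_score, y_true, y_prob))
```

A Brier score near zero means the predicted probabilities sit close to the observed 0/1 outcomes, which is why it is usually read alongside AUC: AUC reflects ranking only, whereas the Brier score also penalizes poorly calibrated probabilities.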
SHAP analysis revealed that fetal heart rate (38.6%), birth weight (11.2%), malpresentation (8.1%), hypothermia (7.7%), referral status (7.5%), and prolonged labor (6.5%) together accounted for 79.6% of the model's predictive capacity, a pattern that was consistent across folds (standard deviation of SHAP values < 0.02).

Conclusion: The RF model demonstrated excellent performance in predicting birth asphyxia and offered strong interpretability. Nearly 80% of the model's predictive power was explained by six clinically actionable variables. These findings support the integration of interpretable machine learning tools into routine labor management to reduce birth asphyxia.
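The percentage contributions above can be read as normalized feature-importance shares. The sketch below assumes they were derived as each feature's mean absolute SHAP value divided by the total across all features; the abstract does not confirm this, and the function name `importance_shares`, the toy SHAP matrix, and the toy feature names are hypothetical. The final lines only re-add the six percentages reported in the abstract.

```python
def importance_shares(shap_rows, feature_names):
    """Percent share of each feature in the total mean |SHAP| (assumed definition)."""
    n = len(shap_rows)
    mean_abs = [sum(abs(row[j]) for row in shap_rows) / n
                for j in range(len(feature_names))]
    total = sum(mean_abs)
    return {name: 100.0 * m / total for name, m in zip(feature_names, mean_abs)}

# Toy per-sample SHAP values (hypothetical, one row per newborn, one column per feature).
toy_shap = [
    [0.4, -0.1, 0.0],
    [-0.2, 0.1, 0.1],
    [0.3, 0.0, -0.1],
]
shares = importance_shares(toy_shap, ["fetal_heart_rate", "birth_weight", "other"])
print(shares)

# Shares reported in the abstract for the six leading predictors.
reported = {
    "fetal heart rate": 38.6,
    "birth weight": 11.2,
    "malpresentation": 8.1,
    "hypothermia": 7.7,
    "referral status": 7.5,
    "prolonged labor": 6.5,
}
top_total = round(sum(reported.values()), 1)
print(top_total)  # the six reported shares add up to the stated 79.6
```

Under this reading, the remaining share of roughly 20% would be spread across all other retained features.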
