Predictive Models and Predictors of Under-5 Mortality Using Machine Learning Techniques: Evidence from the 2022 DHS


Abstract

Background

Under-five mortality (U5M) remains a critical public health challenge in Ghana, despite improvements in maternal and child health services. Understanding its key determinants is essential to inform targeted interventions. This study examined socio-demographic, maternal, reproductive, and environmental predictors of U5M using machine learning approaches.

Methods

A nationally representative dataset of 34,663 participants from the 2022 Ghana Demographic and Health Survey (DHS) was analysed. Predictive models included Logistic Regression, Random Forest, XGBoost, and Artificial Neural Networks (ANNs). Model performance was evaluated using accuracy, area under the receiver operating characteristic curve (AUROC), and other relevant metrics. Data were analysed with Stata 17 and R. Statistical significance was set at 0.05.

Results

The overall U5M rate was 6.53% (2,262 deaths), equivalent to 65.3 deaths per 1,000 live births. Gestational age and parity emerged as the strongest predictors: term and post-term births and high parity were associated with increased mortality, while first-born children exhibited protective effects. Antenatal care (ANC) visits were associated with increased mortality, likely reflecting confounding by indication. Socioeconomic factors, including maternal education, household wealth, and rural residence, exerted a secondary predictive influence. ANN, XGBoost, and Logistic Regression demonstrated superior predictive performance, capturing complex nonlinear relationships among risk factors. After analysing the important features, the best-performing models were Random Forest, Logistic Regression, and ANN. Regional disparities were pronounced, with the highest U5M in the Oti, Northern, Savannah, and Western regions.

Conclusions

Gestational age and parity are pivotal determinants of U5M in Ghana, highlighting critical windows for intervention. While socio-economic factors remain relevant, clinical and reproductive variables dominate prediction. Machine learning models, particularly Random Forest, Logistic Regression, and ANN, offer robust tools for risk stratification, enabling targeted maternal-child health interventions and supporting efforts to achieve Sustainable Development Goal targets for child survival.
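The two headline quantities in the abstract, the mortality rate and the AUROC, can be illustrated with a minimal Python sketch. This is not the authors' code (the study reports using Stata 17 and R). The function names `mortality_rate` and `auroc` are hypothetical, the toy labels and scores are made up for illustration, and the rate calculation assumes the 34,663 records serve as the live-birth denominator, as the abstract implies.

```python
def mortality_rate(deaths: int, total: int) -> tuple[float, float]:
    """Return (percentage, deaths per 1,000) for the given counts."""
    if total <= 0:
        raise ValueError("total must be positive")
    proportion = deaths / total
    return proportion * 100, proportion * 1000


def auroc(labels: list[int], scores: list[float]) -> float:
    """AUROC computed as the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Counts reported in the abstract.
pct, per_1000 = mortality_rate(2262, 34663)
print(f"{pct:.2f}% = {per_1000:.1f} per 1,000")

# Toy example only: four hypothetical children, 1 = died before age five.
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(f"toy AUROC = {toy_auc:.2f}")
```

The pairwise definition of AUROC used here is quadratic in the number of cases, which is fine for illustration; on a dataset of the study's size a rank-based implementation from a statistics library would be the more practical choice.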
