Predicting the Level of Anemia among Ethiopian Neonatal Using Ensemble Machine Learning Algorithms
Abstract
Anemia occurs when the number of red blood cells, and hence the blood's oxygen-carrying capacity, is insufficient to meet the body's physiological demands. According to the World Health Organization, neonatal hemoglobin (Hb) levels are classified as normal (14–24 g/dL), mild anemia (10–13.9 g/dL), moderate anemia (7–9.9 g/dL), and severe anemia (< 7 g/dL); anemia is also influenced by socioeconomic and demographic factors. Previous studies did not generate actionable rules for policymakers, design or deploy artifacts, or construct multi-class machine learning models that predict neonatal women's anemia from these factors. This study develops a predictive model and prototype for neonatal women's anemia level using data from the Ethiopian Demographic Health Survey (2005–2016). The data were preprocessed to ensure high-quality input suitable for machine learning. Following a design science research strategy, four experiments were conducted on 42,376 instances with 22 features. The data were split into training and testing sets (80/20), and the Random Forest, XGBoost, Decision Tree, and CatBoost algorithms were applied, achieving accuracies of 98.37%, 98.22%, 97.92%, and 74.43%, respectively. Random Forest was selected as the best algorithm based on objective and subjective evaluations. Feature importance analysis identified the key determinants: age, region, contraceptive use and intention, body mass index, diarrhea, respondent weight, husband's occupation, cooking fuel type, literacy, wealth index, women's education, and place of residence. These factors guided the design of an artifact built with the Flask framework and deployed on the PythonAnywhere cloud platform. Subjective evaluation indicated 88% user acceptance. The model is not yet integrated with a knowledge-based system; future research should integrate the two to develop an intelligent system for automated prediction of neonatal women's anemia levels.
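The hemoglobin thresholds quoted in the abstract can be read as a rule for deriving the multi-class target. The following is a minimal sketch of such a labelling rule, not the authors' implementation: the function name `anemia_level`, the label strings, and the handling of values between the stated ranges (for example 9.95 g/dL) or above 24 g/dL are assumptions made for illustration.

```python
def anemia_level(hb_g_dl: float) -> str:
    """Map a hemoglobin value in g/dL to one of the classes named in the abstract.

    Hypothetical helper: the abstract lists the ranges but not how boundary
    values are treated, so half-open intervals are assumed here.
    """
    if hb_g_dl < 0:
        raise ValueError("hemoglobin cannot be negative")
    if hb_g_dl < 7:
        return "severe"
    if hb_g_dl < 10:
        return "moderate"
    if hb_g_dl < 14:
        return "mild"
    if hb_g_dl <= 24:
        return "normal"
    # Values above 24 g/dL fall outside every range listed in the abstract.
    return "out_of_range"


# Example: label a few hypothetical Hb readings.
labels = [anemia_level(hb) for hb in (6.5, 8.2, 12.0, 15.3)]
print(labels)  # ['severe', 'moderate', 'mild', 'normal']
```

Using half-open intervals avoids leaving gaps between the published ranges, so every non-negative reading up to 24 g/dL receives exactly one label.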