From Risk Factors to Predictive Modelling: Applying Machine Learning to Childhood Malaria Surveillance in Resource-Limited Settings

Joseph Opeolu Ashaolu
Taiwo S. Akanji
Victoria I. Ayansola
Olajumoke O. Olawale-Succes
Agbolade J. Sunday
Sylvain Y.M. Some

Abstract

Background: Malaria remains a serious public health issue in sub-Saharan Africa, especially among children under five. Despite control efforts, Nigeria accounts for almost 30% of malaria-related child deaths globally. Machine learning (ML) approaches can detect complex patterns in large datasets and may therefore improve prediction accuracy and clarify the drivers of malaria in children, supporting informed, targeted interventions.

Methods: We conducted a cross-sectional study of 693 caregiver-child pairs from high-burden Internally Displaced Persons (IDP) camps in Nigeria. Data on sociodemographic characteristics, household conditions, malaria knowledge, and prevention practices were collected alongside Rapid Diagnostic Test (RDT) results. The data were split 70:30 to train and evaluate four ML models: Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), and Gradient Boosting Machine (GBM). Model performance was evaluated using Area Under the Curve (AUC), precision, recall, and F1-score, and variable importance was used to identify key predictors.

Results: Malaria prevalence was 68.5%, and significant associations were observed with caregiver gender, education, and housing conditions. Having a male caregiver was associated with reduced odds of malaria positivity (aOR = 0.44, p < 0.001), and mud walls were protective against malaria positivity (aOR = 0.60, p = 0.002). Random Forest (AUC = 0.89) was the top-performing model, identifying caregiver occupation (15.7% importance) and residential camp (14.7% importance) as the leading predictors. GBM (AUC = 0.87) and LR (AUC = 0.82) followed, while DT (AUC = 0.78) had the lowest AUC. There was a clear knowledge gap: 60.3% of caregivers lacked malaria prevention knowledge.

Conclusion: Machine learning improves malaria risk prediction, with RF performing best. Important modifiable factors include housing conditions, caregiver education, and localized vector control. This study recommends a precision public health approach that integrates ML into surveillance for real-time risk mapping and resource optimization in high-burden areas.
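
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not say which software was used or whether the 70:30 split was stratified, the function names are hypothetical, and the labels and scores are made-up toy values, not study data. Only the sample size of 693 comes from the abstract.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle rows and split them 70:30 (the seed is an arbitrary choice)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2), which is fine
    for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# 693 placeholder record IDs, matching only the study's sample size.
train, test = split_70_30(range(693))

# Toy RDT labels (1 = positive) and hypothetical model scores.
y_true = [1, 1, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.6, 0.7, 0.2, 0.3, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is assumed

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
toy_auc = auc(y_true, scores)
print(len(train), len(test), precision, recall, f1, toy_auc)
```

AUC is threshold-free, whereas precision, recall, and F1 depend on the cutoff used to turn scores into predicted classes, so the two kinds of metric can rank models differently.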

Version published to 10.21203/rs.3.rs-7352919/v1 on Research Square
Sep 11, 2025

Related articles

Machine Learning Classification of Favorable vs Unfavorable Tuberculosis Treatment Outcomes Using Clinical and Sociodemographic Data from Brazil’s SINAN-TB (2001–2023)

This article has 9 authors:
1. Maicon Herverton Lino Ferreira da Silva Barros
2. José Mário Nunes da Silva
3. Virginia Vilhena
4. José Roberto Ferreira Melo
5. Larissa Souza França
6. Lucia Rolim Santana de Freitas
7. Lívia Teixeira de Souza Maia
8. Patricia Takako Endo
9. Walter Massa Ramalho
This article has no evaluations. Latest version: Sep 3, 2025

A multi-stage machine learning framework for stepwise prediction of tuberculosis treatment outcomes: Integrating gradient boosted decision trees and feature-level analysis for clinical decision support

This article has 4 authors:
1. Linfeng Wang
2. Susana Campino
3. Taane G. Clark
4. Jody E. Phelan
This article has no evaluations. Latest version: Oct 19, 2025

Harnessing Machine Learning for Antimicrobial Resistance Surveillance in Zimbabwe

This article has 10 authors:
1. Liberty Mutahwa
2. Hilary Takunda Takawira
3. Tinashe Muteveri
4. Perkins Watambwa
5. Delson Chikobvu
6. Whatmore Sengweni
7. Claris Siyamayambo
8. Justice Kasiroori
9. Lyson Chaka
10. Farai Mlambo
This article has no evaluations. Latest version: Sep 23, 2025
