Ensemble Machine Learning for Malaria Diagnosis in Resource-Limited Settings Using Clinical and Demographic Features

Panashe Nyengera
Hilary Takawira
Farai Mlambo

Abstract

Background

Sub-Saharan Africa continues to shoulder the heaviest burden of malaria. According to the 2024 WHO malaria report, Africa accounted for 94% of global cases and 95% of deaths. In the WHO African Region, progress towards the elimination and management of malaria is hindered by weak health systems and a lack of traditional diagnostic methods such as microscopy and malaria rapid diagnostic tests (mRDTs). The primary aim of this study is to develop a machine learning (ML) ensemble model for malaria diagnosis using clinical and demographic data, tailored to resource-limited settings.

Methods

A retrospective study was conducted using 637 patient records from Gutu Mission Hospital and Gweru Provincial Hospital in Zimbabwe. Clinical symptoms (fever, chills, abdominal pain, headache and diarrhoea) and demographic features (age, gender, residence and travel history) were analysed. Data preprocessing included handling class imbalance with the Synthetic Minority Oversampling Technique (SMOTE) and feature selection with recursive feature elimination (RFE). Seven individual ML models, namely Logistic Regression (LR), Random Forest (RF), Decision Tree (DT), Gradient Boosting (GB), K-Nearest Neighbors (KNN), Naive Bayes (NB) and XGBoost, were trained and evaluated on the malaria dataset. The individual models were then combined to build, train and evaluate ensemble models such as Bagging, Stacking, Soft Voting and AdaBoost. Model performance was assessed using accuracy, precision, recall, F1 score, AUC-ROC and confusion matrices.
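
The oversampling step can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority record is an interpolation between a real minority record and one of its nearest minority neighbours. This is a standard-library illustration only, not the authors' pipeline and not the imbalanced-learn implementation; the function name `smote_oversample` and its parameters are assumptions made for this example.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of minority-class vectors.

    Hypothetical helper: picks a random minority point, then interpolates
    toward one of its k nearest minority neighbours (Euclidean distance).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank every other minority point by distance to the chosen base point.
        neighbours = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two numeric features (illustrative values only).
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(toy_minority, n_new=6, k=2, seed=1)
print(len(new_points))
```

Because the synthetic points are interpolated, binary symptom indicators become fractional values under plain SMOTE. The abstract does not say how this was handled; variants designed for categorical features, such as `SMOTENC` in imbalanced-learn, are one option.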

Results

Clinical symptoms (chills: p=0.001, fever: p=0.003, diarrhoea: p=0.01, abdominal pain: p<0.001) were statistically significant predictors of malaria. Of the demographic factors, only travel history (p=0.02) showed a significant association with malaria. Among the seven individual ML models, GB achieved the highest predictive performance (accuracy = 0.94), followed by RF (accuracy = 0.94) and XGBoost (accuracy = 0.93). The stacking ensemble model outperformed all individual ML models and the other ensemble models (bagging, soft voting and AdaBoost), achieving accuracy = 0.96, precision = 0.95, recall = 0.98, F1 score = 0.96 and AUC-ROC = 0.98.
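
As a quick consistency check on the reported metrics, F1 is the harmonic mean of precision and recall, so the stacking model's F1 score can be recomputed from its stated precision and recall. The sketch below also shows how accuracy, precision, recall and F1 follow from confusion-matrix counts; the counts used are purely illustrative and are not taken from the study, and the function names are assumptions for this example.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 score as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Reported stacking results: precision = 0.95, recall = 0.98.
print(round(f1_from_precision_recall(0.95, 0.98), 2))  # 0.96, matching the reported F1

# Hypothetical counts, for illustration only (not data from the study).
print(classification_metrics(tp=40, fp=10, fn=5, tn=45))
```

In a diagnostic setting, recall corresponds to sensitivity: the share of true malaria cases the model flags.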

Conclusion

This study demonstrates that ML, particularly ensemble models, can significantly improve malaria diagnosis. Integrating these models into a web-based application could provide a scalable and accessible diagnostic tool for healthcare workers in resource-limited settings.

Version published to 10.1101/2025.08.03.25332923 on medRxiv
Aug 6, 2025

Related articles

Evaluating the performance of an artificial intelligence-based electronic reader for malaria rapid diagnostic tests across four sub-Saharan African countries

This article has 16 authors:
1. Kim A. Lindblade
2. Corine Ngufor
3. William Yavo
4. Sunday Atobatele
5. Arthur Mpimbaza
6. Nelson Ssewante
7. Ese Akpiroroh
8. Abibatou Konaté-Toure
9. Idelphonse Ahogni
10. Augustin Kpemasse
11. Antoine Mea Tanoh
12. Godwin Ntadom
13. Jimmy Opigo
14. Stephanie Zobrist
15. Kevin Griffith
16. Michael Humes
Latest version Jul 30, 2025

Accuracy of recording of malaria rapid diagnostic tests in Côte d’Ivoire

This article has 12 authors:
1. Valérie Akoua Bedia-Tanoh
2. Abibatou Konaté-Touré
3. Orphée M.A. Kangah Kouakou
4. Anatole N. N. Mian
5. Antoine M. Tanoh
6. Michael Humes
7. Kevin Griffith
8. John J. Aponte
9. Emily Hilton
10. Shawna Cooper
11. Kim A. Lindblade
12. Yavo William
Latest version Sep 3, 2025

Machine Learning Classification of Favorable vs Unfavorable Tuberculosis Treatment Outcomes Using Clinical and Sociodemographic Data from Brazil’s SINAN-TB (2001–2023)

This article has 9 authors:
1. Maicon Herverton Lino Ferreira da Silva Barros
2. José Mário Nunes da Silva
3. Virginia Vilhena
4. José Roberto Ferreira Melo
5. Larissa Souza França
6. Lucia Rolim Santana de Freitas
7. Lívia Teixeira de Souza Maia
8. Patricia Takako Endo
9. Walter Massa Ramalho
Latest version Sep 3, 2025
