Application of Machine Learning Models in Predicting Malaria Prevalence in Nigeria: An Analysis of the 2015-2020 Demographic and Health Surveys


Abstract

Malaria is a major public health problem, especially in sub-Saharan Africa and other developing regions, where the majority of malaria cases and deaths occur. This study developed a machine learning (ML) model to accurately diagnose malaria in rural communities in Nigeria, based on patients’ symptoms and other clinical information, using low-cost and readily available diagnostic tools. The model was trained on the 2020 Nigerian Demographic and Health Surveys (NDHS) Program Geospatial Covariate datasets, which contain clinical information on patients in Nigeria. ML approaches were preferred over traditional statistical methods because they can handle high-dimensional, non-linear relationships and interactions among a diverse set of variables. Regression-based algorithms were used to predict malaria incidence as a continuous outcome, allowing finer-grained spatial and demographic insights than binary classification would provide. The models were validated using cross-validation and holdout testing to assess generalizability and minimize overfitting. The close agreement between the predicted and observed malaria incidence scores indicates the robustness of the ML models. The coefficient of determination (R²) scores of the Random Forest Regressor (RFR), Multiple Linear Regression (MLR), and Ridge models on the test set were 0.9937, 0.9916, and 0.9924, respectively, indicating strong predictive performance for all three models. The RFR model's learning curve showed a consistent pattern: performance on the test dataset improved as the volume of data increased. By shifting from reactive diagnostics to proactive risk prediction, health authorities can allocate resources more effectively, improve intervention timing, and reach underserved rural communities with precision.
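The validation scheme described in the abstract (cross-validation plus a holdout test set, scored by the coefficient of determination) can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic stand-ins rather than NDHS covariates, the helper names (`r2_score`, `fit_ridge`, `cross_val_r2`) and the ridge penalty `alpha=1.0` are assumptions made for illustration, and only the linear and ridge models are shown. A random forest is omitted to keep the example dependency-light.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_ridge(X, y, alpha=0.0):
    """Closed-form ridge fit; alpha=0 reduces to ordinary least squares.
    Data are centred first so the intercept is not penalised."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def predict(model, X):
    coef, intercept = model
    return X @ coef + intercept

def cross_val_r2(X, y, alpha, k=5, seed=0):
    """Mean R^2 over k shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit_ridge(X[train], y[train], alpha)
        scores.append(r2_score(y[val], predict(model, X[val])))
    return float(np.mean(scores))

# Synthetic stand-in data (hypothetical; NOT the NDHS covariates).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 6))
true_coef = np.array([1.5, -2.0, 0.7, 0.0, 3.0, -0.5])
y = X @ true_coef + 4.0 + rng.normal(scale=0.3, size=500)

# Holdout split: 400 rows for training and cross-validation, 100 for testing.
idx = rng.permutation(len(y))
train_idx, test_idx = idx[:400], idx[400:]

results = {}
for name, alpha in [("MLR", 0.0), ("Ridge", 1.0)]:
    cv_r2 = cross_val_r2(X[train_idx], y[train_idx], alpha)
    model = fit_ridge(X[train_idx], y[train_idx], alpha)
    test_r2 = r2_score(y[test_idx], predict(model, X[test_idx]))
    results[name] = (cv_r2, test_r2)
    print(f"{name}: CV R^2 = {cv_r2:.4f}, holdout R^2 = {test_r2:.4f}")
```

Comparing the cross-validated score with the holdout score for each model is one simple way to check for overfitting: a large gap between the two would suggest the model does not generalize beyond the training folds.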
