Development and Deployment of a Machine Learning–Based Predictive Model for COVID-19 Infection Using Patient Demographic and Symptom Data in Nigeria

Abstract

Background: Timely identification of COVID-19 cases is critical for clinical management and public health control, particularly in resource-limited settings. Although RT-PCR testing remains the gold standard, its limited accessibility during peak transmission highlights the role of predictive tools.

Objective: This study aimed to develop and deploy a machine learning–based predictive model for COVID-19 infection using demographic and symptom data from patients in Nigeria.

Methods: Patient records were preprocessed, including cleaning, encoding of categorical variables, and feature selection. Logistic regression, random forest, and gradient boosting models were compared using ten-fold cross-validation. Model performance was assessed using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, precision, and F1-score. The best-performing model was deployed as a web-based decision-support tool via R Shiny.

Results: A total of 43,442 patient records were included, with 3,712 (8.5%) confirmed positive cases. COVID-19 positivity was significantly associated with male sex, older age, and symptoms such as cough, fever, and dyspnea (all p < .05). Logistic regression achieved an AUROC of 0.93, sensitivity of 0.91, specificity of 0.76, and F1-score of 0.95. The model demonstrated strong recall but comparatively lower specificity.

Conclusion: We developed and deployed a lightweight, interpretable predictive model for COVID-19, available as a Shiny application (http://bit.ly/41LxW9p). Once externally validated, this tool could be proposed to support rapid triage and early decision-making in resource-constrained settings.
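The threshold-based metrics named in the abstract (sensitivity, specificity, precision, and F1-score) all derive from the four cells of a confusion matrix. The following is a minimal Python sketch of those standard definitions, not the authors' code (the study itself used R). The `ConfusionCounts` class, the `classification_metrics` function, and the example counts are hypothetical and chosen only for illustration; they are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # infected patients predicted positive
    fp: int  # uninfected patients predicted positive
    tn: int  # uninfected patients predicted negative
    fn: int  # infected patients predicted negative


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard threshold-based metrics from confusion counts.

    Assumes every denominator is non-zero, i.e. there is at least one
    actual positive, one actual negative, and one predicted positive.
    """
    sensitivity = c.tp / (c.tp + c.fn)  # recall: share of true cases caught
    specificity = c.tn / (c.tn + c.fp)  # share of non-cases correctly cleared
    precision = c.tp / (c.tp + c.fp)    # share of positive calls that are right
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Illustrative counts only (NOT from the study).
counts = ConfusionCounts(tp=90, fp=20, tn=80, fn=10)
metrics = classification_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because precision depends on the share of positives in the sample, the same sensitivity and specificity can yield quite different precision and F1 values at different prevalence levels, which is worth keeping in mind when comparing reported figures across datasets.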
