Machine Learning Analysis of COVID-19 Transmission Dynamics, Demographic Risk, and Contact Tracing Outcomes in Nigeria


Abstract

The COVID-19 pandemic has posed significant challenges to developing countries such as Nigeria, where resources are limited. Accurate prediction of disease spread is crucial for effective containment. This study investigates the application of statistical and machine learning (ML) techniques to modelling and predicting COVID-19 cases in Nigeria, using data from January 2020 through December 2021. By analyzing demographic data (age, gender, location), symptom patterns, and contact tracing information, we seek to identify correlations and temporal trends associated with disease transmission. The datasets, obtained from the National Centre for Disease Control (NCDC), were cleaned before statistical analysis with Pearson's correlation, analysis of variance (ANOVA), and Cramér's V. Prediction was carried out with a random forest (RF) classification model implemented in Python's scikit-learn library. Key findings include: (1) 94.97% of confirmed contacts tested positive, underscoring high transmission rates; (2) healthcare workers and students were high-risk occupational groups; and (3) the RF model achieved 87% accuracy in classifying source cases, though it struggled with minority classes. These findings can inform evidence-based policymaking and contribute to mitigating the impact of future outbreaks. A limitation of this study is its dependence on the accuracy of the NCDC data.
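Two quantities named in the abstract can be made concrete with a minimal sketch. It is not the study's code: the function names (`cramers_v`, `accuracy`, `per_class_recall`) and the toy labels are hypothetical, no NCDC data is used, and only the standard library is assumed. The first function computes Cramér's V (without bias correction) for two categorical variables; the other two show how overall accuracy can stay high while recall on a minority class is poor, which is one plausible reading of the abstract's remark about minority classes.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length label sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    row_totals = Counter(xs)           # marginal counts of the first variable
    col_totals = Counter(ys)           # marginal counts of the second variable
    observed = Counter(zip(xs, ys))    # joint counts (contingency table cells)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        raise ValueError("each variable needs at least two categories")
    # Pearson chi-squared statistic summed over every cell of the table.
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each class that appears in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


# Hypothetical toy labels, for illustration only (not NCDC records).
occupation = ["health", "health", "student", "student"]
outcome = ["positive", "positive", "negative", "negative"]
print(cramers_v(occupation, outcome))

# A classifier that always predicts the majority class.
y_true = ["not_source"] * 9 + ["source"]
y_pred = ["not_source"] * 10
print(accuracy(y_true, y_pred))
print(per_class_recall(y_true, y_pred))
```

In the second toy example the always-majority classifier scores 0.9 accuracy yet recovers none of the `source` cases, which is why per-class metrics are worth reporting alongside a headline accuracy figure.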
