Forecasting COVID-19 cases at the Amazon region: a comparison of classical and machine learning models
Listed in
- Evaluated articles (ScreenIT)
Abstract
BACKGROUND
Since the first reports of COVID-19, decision-makers have relied on traditional epidemiological models to forecast the course of the disease. However, growing computational power, the demand for adaptable predictive frameworks, the short history of the disease, and uncertainties in input data and prediction rules also make other classical and machine learning techniques viable options.
OBJECTIVE
This study investigates the efficiency of six models in forecasting confirmed COVID-19 cases 17 days ahead. We compare autoregressive integrated moving average (ARIMA), Holt-Winters, support vector regression (SVR), k-nearest neighbors regressor (KNN), random trees regressor (RTR), seasonal linear regression with change-points (Prophet), and simple logistic regression (SLR).
MATERIAL AND METHODS
We apply the models to data provided by the health surveillance secretary of Amapá, a Brazilian state located entirely within the Amazon rainforest, which has been experiencing high infection rates. We evaluate the models according to their capacity to forecast in different historical scenarios of the COVID-19 progression, such as exponential increases, sudden decreases, and periods of stability in daily cases. To do so, we use a rolling forward splitting approach for out-of-sample validation. We use the metrics RMSE, R-squared, and sMAPE to evaluate the models across the cross-validation sections.
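The validation scheme described above can be illustrated with a minimal sketch. It is not the authors' code: the expanding training window, the step size, the sMAPE variant (mean of |a - p| / ((|a| + |p|) / 2), in percent), and the names `rolling_forward_splits`, `naive_forecast`, and `evaluate` are all assumptions made for illustration. Only the 17-day horizon and the three metrics come from the abstract, and the placeholder forecaster stands in for any of the compared models.

```python
from math import sqrt

HORIZON = 17  # forecast horizon in days, as stated in the abstract


def rolling_forward_splits(n_obs, initial_train, horizon=HORIZON, step=HORIZON):
    """Yield (train_indices, test_indices) pairs with an expanding training window.

    Assumption: the window grows by `step` observations per split.
    """
    end = initial_train
    while end + horizon <= n_obs:
        yield range(0, end), range(end, end + horizon)
        end += step


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination; NaN when the actual values are constant."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot if ss_tot else float("nan")


def smape(actual, predicted):
    """Symmetric MAPE in percent; a term with both values zero counts as 0."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        total += abs(a - p) / denom if denom else 0.0
    return 100.0 * total / len(actual)


def naive_forecast(train, horizon):
    """Hypothetical placeholder model: repeat the last observed value."""
    return [train[-1]] * horizon


def evaluate(series, initial_train, forecast=naive_forecast):
    """Score a forecaster on every rolling-forward split of the series."""
    scores = []
    for train_idx, test_idx in rolling_forward_splits(len(series), initial_train):
        train = [series[i] for i in train_idx]
        test = [series[i] for i in test_idx]
        pred = forecast(train, len(test))
        scores.append(
            {"rmse": rmse(test, pred), "r2": r_squared(test, pred), "smape": smape(test, pred)}
        )
    return scores
```

Calling `evaluate(daily_cases, initial_train=30)` on a list of daily case counts would return one dictionary of scores per split; any function with the signature `forecast(train, horizon)` can replace the placeholder.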
FINDINGS
All models outperform SLR, especially Holt-Winters, which performs satisfactorily in all scenarios. SVR and ARIMA perform better in isolated scenarios. To implement the comparisons, we created a web application, which is available online.
CONCLUSION
This work is an effort to assist the decision-makers of Amapá in future decisions, especially under sudden variations in the number of confirmed cases in the state, which could be caused, for instance, by new contamination waves or vaccination. It is also an attempt to highlight alternative models that could be used in future epidemics.
Article activity feed
SciScore for 10.1101/2020.10.09.332908:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
No key resources detected.
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- No conflict of interest statement was detected. If there are no conflicts, we encourage authors to explicitly state so.
- No funding statement was detected.
- No protocol registration statement was detected.
