Forecasting COVID-19 cases using Machine Learning models

Abstract

As of April 26, 2020, more than 2,994,958 cases of COVID-19 infection had been confirmed globally, posing a serious public health challenge. A predictive model of the disease would help allocate medical resources and set social distancing measures more efficiently. In this paper, we gathered case data from January 22, 2020 to April 14, 2020 for 6 countries to compare how well different models predict COVID-19 cases. We assessed the performance of 3 machine learning models, a hidden Markov chain model (HMM), a hierarchical Bayes model, and a long short-term memory (LSTM) model, using the root-mean-square error (RMSE). The LSTM model consistently had the smallest prediction error when tracking the dynamics of incident cases in 4 countries. In contrast, the hierarchical Bayes model provided the most realistic predictions and was able to identify a plateau point in the incident-case growth curve.
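The RMSE comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: RMSE is taken in its standard form, the square root of the mean squared difference between observed and predicted case counts, and the helper names (`rmse`, `best_model`), the model labels, and all numbers are hypothetical placeholders rather than values from the paper.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def best_model(actual, forecasts):
    """Return the name of the forecast with the smallest RMSE, plus all scores."""
    scores = {name: rmse(actual, pred) for name, pred in forecasts.items()}
    return min(scores, key=scores.get), scores


# Made-up daily case counts for one country (illustration only).
observed = [100, 120, 150, 190]

# Made-up forecasts from two hypothetical models over the same days.
forecasts = {
    "model_a": [102, 118, 149, 193],
    "model_b": [90, 130, 160, 170],
}

winner, scores = best_model(observed, forecasts)
for name, score in sorted(scores.items()):
    print(f"{name}: RMSE = {score:.2f}")
print("smallest error:", winner)
```

Because RMSE is expressed in the same unit as the data (cases), scores are comparable across models for a single country, but not directly across countries with very different case counts.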

Article activity feed

  1. SciScore for 10.1101/2020.07.02.20145474:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Overall, a caveat of our study is that none of the countries we studied had reached a plateau in their infection count. To complete a holistic search of models, given more time, we should compare across more models, train on a larger data set, and increase the prediction window. Future work would also include integrating other factors that play a role in the spread of infection, such as geographic location, population density, GDP, and the number of hospitals and doctors. In addition, applying an ensemble method to average the results from the LSTM model and the hierarchical Bayes model may also provide a more accurate prediction model.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.