Predicting the Epidemic Curve of the Coronavirus (SARS-CoV-2) Disease (COVID-19) Using Artificial Intelligence

Abstract

The aim of our study was to predict the epidemic curves (daily new cases) of the COVID-19 pandemic using Artificial Intelligence (AI)-based Recurrent Neural Networks (RNNs), and then to compare and validate the predicted models against the observed data. We used the publicly available datasets from the World Health Organization and Johns Hopkins University to create a training dataset, then used RNNs with gated recurrent cells (Long Short-Term Memory, LSTM) to create two prediction models. Information collected in the first t time-steps was aggregated with a fully connected (dense) neural network layer and a subsequent regression output layer to determine the next predicted value. We also used Root Mean Squared Logarithmic Errors (RMSLE) to compare the predicted models with the observed data. The results of our study underscore that the COVID-19 pandemic is a propagated source epidemic; therefore, repeated peaks on the epidemic curve are to be anticipated. In addition, the errors between the predicted and validated data and trends seem to be low. The influence of this pandemic is significant worldwide and has already impacted our daily life. Decision makers must be aware that even if strict public health measures are executed and sustained, future peaks of infections are possible.
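The two data-handling steps named in the abstract, using the first t time-steps to predict the next value and scoring predictions with RMSLE, can be illustrated with a minimal sketch. This is not the authors' code: the function names `make_windows` and `rmsle` are hypothetical, the LSTM itself is omitted, and the RMSLE shown is the common log(1 + x) definition, which the abstract does not spell out.

```python
import math


def make_windows(series, t):
    """Pair each run of t consecutive daily values with the value that follows.

    Hypothetical helper: the abstract only says that the first t time-steps
    are used to determine the next predicted value.
    """
    if t < 1 or t >= len(series):
        raise ValueError("t must satisfy 1 <= t < len(series)")
    return [(series[i:i + t], series[i + t]) for i in range(len(series) - t)]


def rmsle(predicted, observed):
    """Root Mean Squared Logarithmic Error, common log(1 + x) form (assumed)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(o)) ** 2
        for p, o in zip(predicted, observed)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


if __name__ == "__main__":
    # Made-up daily new-case counts, for illustration only.
    daily_cases = [12, 15, 21, 30, 28, 41, 55]
    for inputs, target in make_windows(daily_cases, t=3):
        print(inputs, "->", target)

    # Made-up predictions compared with made-up observations.
    print(rmsle([30, 40, 50], [28, 41, 55]))
```

Because RMSLE works on logarithms, it penalizes relative rather than absolute differences, which suits case counts that span several orders of magnitude across countries.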

Article activity feed

  1. SciScore for 10.1101/2020.04.17.20069666:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Nevertheless, there are some limitations in our study. As the nature of SARS-CoV-2 is relatively unknown or dynamic, and it is prone to mutations, predicting the spread of the pandemic is not an easy task. Factors that influenced the reported new cases per day, for example, the efficiency of reporting, the different quality and timing of public health measures, country-specific age pyramids, and the chronic disease burden of the population, were not included in our training data set due to a lack of reliable data. We did not investigate the number of deaths and recoveries, as we found no reliable data at that time. Similarly, the data regarding diagnostic tests performed per country, or death rates, were omitted, given that they are highly influenced by the countries’ economic wellbeing, health care systems, facilities and capacities, and other factors [34,35]. There are many unforeseen uncertainties and coincidences that could not be implemented in our model; for example, there were days when a large number of people were diagnosed with COVID-19 on a single day (for example in care homes in France or Hungary), which caused a large increase in the number of daily new cases [16]. To summarize, the COVID-19 disease is a global health challenge, which forced the WHO to declare it a “public health emergency of international concern” on 30/01/2020 and later a global pandemic [18]. The influence of this global epidemic has dug deep into the day-to-day conduct of everyone, with u...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.