Can Social Media Data Be Utilized to Enhance Early Warning: Retrospective Analysis of the U.S. Covid-19 Pandemic
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The U.S. needs early warning systems to help contain the spread of infectious diseases. Conventional early warning systems use lab-test results or dynamic records to detect early warning signs. Newer early warning systems can supplement these data with indicators of public awareness, such as news articles and search queries. This study explores the potential of using social media data to enhance early warning of the COVID-19 outbreak. To demonstrate feasibility, it conducts a retrospective analysis of more than 14 million related Twitter posts published between January 20 and March 10, 2020. With the aid of natural language processing tools and machine learning classifiers, the study classifies each tweet as either a signal or a non-signal. Here, a “signal” tweet implies that the user recognized the risk of a COVID-19 outbreak in the U.S. The study then proposes a parameter, the “signal ratio”, to flag warning signs of the COVID-19 pandemic over time. Results reveal that social media data and the signal ratio can detect the hazard ahead of the COVID-19 outbreak. This claim is validated, with a lead time of 16 days, through comparison with reference methods based on Google Trends or news media.
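The abstract does not define the signal ratio precisely. As a rough sketch only, assuming it is the share of tweets classified as signals within each day, it could be computed as follows. The function name `signal_ratio_by_day`, the daily window, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from datetime import date


def signal_ratio_by_day(labeled_tweets):
    """Return {day: signal tweets / all tweets} for (day, is_signal) pairs.

    Assumption: the ratio is taken per calendar day; the paper may use a
    different window or normalization.
    """
    totals = Counter()
    signals = Counter()
    for day, is_signal in labeled_tweets:
        totals[day] += 1
        if is_signal:
            signals[day] += 1
    # Every day in `totals` has at least one tweet, so no division by zero.
    return {day: signals[day] / totals[day] for day in sorted(totals)}


# Hypothetical classifier output: (posting date, classified as signal?)
sample = [
    (date(2020, 1, 20), False),
    (date(2020, 1, 20), True),
    (date(2020, 1, 20), False),
    (date(2020, 1, 20), False),
    (date(2020, 1, 21), True),
    (date(2020, 1, 21), True),
]

ratios = signal_ratio_by_day(sample)
for day, ratio in ratios.items():
    print(day.isoformat(), round(ratio, 2))
```

Under this assumed definition, a sustained rise in the daily ratio would be the kind of pattern an early warning rule could watch for.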
Article activity feed
SciScore for 10.1101/2021.04.11.21255285:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentences: In this research context, all words in a tweet were represented with numerical information, and each tweet was mapped into a numerical vector for classification. E. TEXT CLASSIFICATION: After each tweet was converted to a vector of features using TF-IDF, we applied several machine learning classifiers provided by Scikit-learn Python library [53], including Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and Naïve Bayes (NB) to build the pipeline for text classification.
Resources:
- Scikit-learn (suggested: scikit-learn, RRID:SCR_002577)
- Python (suggested: IPython, RRID:SCR_001658)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
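The text classification step quoted in Table 2 (TF-IDF features fed to RF, LR, SVM, and NB classifiers from scikit-learn) might look roughly like the sketch below. The toy tweets, labels, hyperparameters, and the specific choices of `LinearSVC` and `MultinomialNB` are assumptions made for illustration; the paper's actual configuration is not given here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical training data: 1 = "signal" (user recognizes U.S. outbreak risk),
# 0 = "non-signal". Real training would use a much larger labeled set.
tweets = [
    "the virus will spread across the us soon, we should prepare",
    "worried the outbreak is coming to our city",
    "this could become a pandemic here in america",
    "just read a news story about a city far away",
    "watching a movie tonight with friends",
    "stock markets moved today after the report",
]
labels = [1, 1, 1, 0, 0, 0]

# One classifier per family named in the source; the variants are assumed.
classifiers = {
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": LinearSVC(),
    "NB": MultinomialNB(),
}

pipelines = {}
for name, clf in classifiers.items():
    # Each pipeline maps raw text to TF-IDF vectors, then classifies them.
    pipe = make_pipeline(TfidfVectorizer(), clf)
    pipe.fit(tweets, labels)
    pipelines[name] = pipe

new_tweet = ["afraid the outbreak will spread in the us"]
predictions = {name: int(p.predict(new_tweet)[0]) for name, p in pipelines.items()}
print(predictions)
```

Wrapping the vectorizer and classifier in one pipeline keeps the TF-IDF vocabulary fitted only on training text, which matters once the data are split for evaluation.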
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.