Early detection of fraudulent COVID-19 products from Twitter chatter
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Social media have served as lucrative platforms for misinformation and for promoting fraudulent products for the treatment, testing and prevention of COVID-19. This has resulted in the issuance of many warning letters by the United States Food and Drug Administration (FDA). While social media continue to serve as the primary platform for the promotion of such fraudulent products, they also present the opportunity to identify these products early by employing effective social media mining methods. In this study, we employ natural language processing and time series anomaly detection methods for automatically detecting fraudulent COVID-19 products early from Twitter. Our approach is based on the intuition that increases in the popularity of fraudulent products lead to corresponding anomalous increases in the volume of chatter regarding them. We utilized an anomaly detection method on streaming COVID-19-related Twitter data to detect potentially anomalous increases in mentions of fraudulent products. Our unsupervised approach detected 34/44 (77.3%) signals about fraudulent products earlier than the FDA letter issuance dates, and an additional 6/44 (13.6%) within a week following the corresponding FDA letters. Our proposed method is simple, effective and easy to deploy, and, unlike deep neural network-based methods, does not require high-performance computing machinery.
Article activity feed
SciScore for 10.1101/2022.05.09.22274776:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
No key resources detected.
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: 4.2 Limitations: There are several potential limitations of the proposed approach. First, it requires data that is not rate limited (e.g., data from the standard Twitter streaming API). Anomalous increases may not be detectable from rate-limited streams since large increases in volume are likely to be dampened by the APIs. For real-time fraudulent product candidate detection, deployment needs to be on streaming data, although it is also possible to periodically run the anomaly detection scripts on stored, static data. Second, we were only able to calculate the percentage of early detection within our given sample, and based on the current data, we were unable to realistically estimate confidence intervals for the percentage values reported. Third, the anomaly detection approach relies on characteristic abrupt increases in chatter volumes about a given topic. It is possible that some fraudulent products may gain popularity gradually, causing the normalized counts to never go beyond the standard deviation threshold. In such cases, varying the window size (e.g., using 7-day moving averages) and/or lowering the standard deviation thresholds may improve the detection capability of the method. However, lowering the standard deviation threshold is also likely to result in larger numbers of false positives, an aspect that we did not take into account in this study. We believe that not taking false positives into account in the current study is justifiable since in practical settings, al...
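The trade-off described in these limitations (window size versus standard deviation threshold) can be illustrated with a minimal sketch. This is not the authors' code: the function name `detect_anomalies`, the trailing-window design, the default parameter values, and the toy daily counts are all assumptions made for illustration only.

```python
from statistics import mean, stdev


def detect_anomalies(counts, window=7, threshold=3.0):
    """Flag days whose count exceeds the trailing-window mean
    by more than `threshold` sample standard deviations.

    Hypothetical helper: `counts` is a list of daily (possibly
    normalized) mention counts for one candidate product.
    Returns the indices of flagged days.
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]  # the `window` days before day i
        mu = mean(history)
        sigma = stdev(history)          # sample standard deviation
        # With a perfectly flat history sigma is 0, so any increase
        # over the baseline is flagged.
        if counts[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged


# Toy data (invented): an abrupt spike versus a gradual rise.
abrupt = [5, 6, 5, 7, 6, 5, 6, 6, 30, 7]
gradual = [5, 6, 7, 8, 9, 10, 11, 12, 13]

spike_days = detect_anomalies(abrupt, window=7, threshold=3.0)
strict_days = detect_anomalies(gradual, window=7, threshold=3.0)
lenient_days = detect_anomalies(gradual, window=7, threshold=1.5)
```

On this toy data the abrupt spike is flagged at the strict threshold, while the gradual rise is only picked up once the threshold is lowered, which is the behaviour the limitation anticipates; a lower threshold would also flag more ordinary fluctuations on real data.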
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.