The COVID-19 Infodemic: The complex task of elevating signal and eliminating noise

This article has been Reviewed by the following groups

Abstract

In Situation Report #3, 39 days before declaring COVID-19 a pandemic, the WHO declared a COVID-19 infodemic. The volume of coronavirus tweets was far too great for anyone to find accurate or reliable information. Healthcare workers were flooded with noise, which drowned out the signal of valuable COVID-19 information. To combat the infodemic, physicians created healthcare-specific micro-communities to share scientific information with other providers. We analyzed the content of eight physician-created communities and categorized each message into one of five domains. We coded 1) an application programming interface to download tweets and their metadata in JavaScript Object Notation and 2) a reading algorithm, written in Visual Basic for Applications in Excel, to categorize the content. We superimposed the publication date of each tweet onto a timeline of key pandemic events. Finally, we created NephTwitterArchive.com to help healthcare workers find COVID-19-related signal tweets when treating patients. We collected 21071 tweets from the eight hashtags studied. Only 9051 tweets were considered signal: tweets categorized into both a domain and a subdomain. There was a trend towards fewer signal tweets as the pandemic progressed, with a daily median of 22% (IQR 0-42%). The most popular subdomain in Prevention was PPE (2448 signal tweets). In Therapeutics, Hydroxychloroquine/chloroquine with or without Azithromycin and Mechanical Ventilation were the most popular subdomains. During the active Infodemic phase (Days 0 to 49), a total of 2021 searches were completed in NephTwitterArchive.com, a 26% increase over the same length of time before the pandemic was declared (Days −50 to −1). The COVID-19 Infodemic indicates that future endeavors must be undertaken to eliminate noise and elevate signal in all aspects of scientific discourse on Twitter. In the absence of any algorithm-based strategy, healthcare providers will be left with the nearly impossible task of manually finding high-quality tweets amid a tidal wave of noise.
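The pipeline described in the abstract (tweets downloaded as JSON, each categorized into a domain and subdomain, and counted as signal only when both are assigned) can be illustrated with a minimal Python sketch. This is not the authors' Visual Basic for Applications code. The domain and subdomain names are taken from the abstract, but the keyword lists, the sample tweets, the `created_at` and `text` field names, and the helper names `categorize` and `daily_signal_share` are all assumptions made for illustration.

```python
import json
from collections import defaultdict
from statistics import median, quantiles

# Hypothetical keyword lexicon: the domain and subdomain names come from the
# abstract, but the keywords themselves are illustrative assumptions.
LEXICON = {
    "Prevention": {
        "PPE": ["ppe", "n95", "face shield"],
    },
    "Therapeutics": {
        "Hydroxychloroquine/chloroquine": ["hydroxychloroquine", "chloroquine"],
        "Mechanical Ventilation": ["ventilator", "mechanical ventilation"],
    },
}


def categorize(text):
    """Return (domain, subdomain) for the first keyword match, else None.

    Uses naive substring matching, so short keywords can over-match.
    """
    lowered = text.lower()
    for domain, subdomains in LEXICON.items():
        for subdomain, keywords in subdomains.items():
            if any(keyword in lowered for keyword in keywords):
                return domain, subdomain
    return None


def daily_signal_share(tweets):
    """Map each date to the percentage of that day's tweets that are signal."""
    totals = defaultdict(int)
    signals = defaultdict(int)
    for tweet in tweets:
        day = tweet["created_at"][:10]  # assumes ISO 8601 timestamps
        totals[day] += 1
        if categorize(tweet["text"]) is not None:
            signals[day] += 1
    return {day: 100.0 * signals[day] / totals[day] for day in totals}


# Made-up sample records standing in for downloaded tweet JSON.
SAMPLE_JSON = """[
  {"created_at": "2020-03-11T09:00:00Z", "text": "N95 reuse protocol for dialysis units"},
  {"created_at": "2020-03-11T10:00:00Z", "text": "Stay safe everyone!"},
  {"created_at": "2020-03-12T08:30:00Z", "text": "Ventilator settings in ARDS, a thread"},
  {"created_at": "2020-03-13T12:00:00Z", "text": "Good morning"}
]"""

tweets = json.loads(SAMPLE_JSON)
shares = daily_signal_share(tweets)
values = sorted(shares.values())
q1, _, q3 = quantiles(values, n=4)  # quartile cut points (needs >= 2 days)
print(shares)
print(f"median {median(values):.0f}% (IQR {q1:.0f}-{q3:.0f}%)")
```

A keyword lexicon of this kind cannot tell a well-supported tweet from a politically motivated one that mentions the same drug name. Any real use would need a curated lexicon and some check for supporting evidence.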

Article activity feed

  1. SciScore for 10.1101/2021.01.19.21249936:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentence: We stored the translated tweets in Microsoft Excel and categorized each into one of five domains and one of 84 subdomains.
    Resource: Microsoft Excel
    Suggested: (Microsoft Excel, RRID:SCR_016137)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Two representative examples show the limitations and strengths of our algorithms. We believe the significant signal activity about Hydroxychloroquine/chloroquine with or without Azithromycin had more to do with a political misunderstanding of its efficacy (or lack thereof) than with a legitimate scientific controversy about its effectiveness (14). Our algorithms could not distinguish political misunderstanding from legitimate scientific controversy, and thus tweets about hydroxychloroquine were classified as signal and some were archived in NephTwitterArchive.com. We were surprised that neither vaccines nor the drug Remdesivir were actively discussed. The lack of signal tweets for either subdomain may suggest that not enough scientific information and/or clinical experience was available to generate informative tweets (15). In this case, our algorithms identified these tweets as unsupported by external evidence and eliminated them from the archive.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.