Social media study of public opinions on potential COVID-19 vaccines: informing dissent, disparities, and dissemination

Curation statements for this article:
  • Curated by eLife

Abstract

No abstract available

Article activity feed

  1. Evaluation Summary:

    This study uses social media data (namely Twitter) to analyse the factors behind COVID-19 vaccine acceptance. It first trains a classifier to detect whether a tweet is pro-vaccine, neutral, or anti-vaccine. Then, using a large corpus of accounts, it investigates multiple factors explaining this position in a light counterfactual analysis. The central finding is that the most socioeconomically disadvantaged groups are more likely to hold polarized opinions on COVID-19 vaccines; other findings include that personal pandemic experience has an important impact on acceptance, and that interest in politics modulates acceptance. This study is a good example of what machine learning can do with social media data; however, it is also a good example of the high data demands and limitations of a machine learning approach. The correlations found are plausible, but the causal implications are not evidenced strongly enough to guide public policy.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

  2. Reviewer #1 (Public Review):

    In this study, the authors use social media data to investigate the determinants of acceptance of COVID-19 vaccines.

    The strengths of the study are the size and diversity of the population sample that social media provides, as well as the ability to tease out various factors contributing to vaccine acceptance, which the authors pursue in a counterfactual-based causal analysis. However, the study suffers from imperfect evidence with regard to the causal interpretation of the findings, as the methods used do not strongly control for confounding effects: they are based on an outcome model with a multinomial regression, which is strongly parametric. Also, the authors note a discrepancy between their findings and those of a prior survey-based study, but do not explore the potential reasons behind this discrepancy, which casts doubt on the overall validity of the findings.
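
    To make the concern about a parametric multinomial regression used for counterfactual predictions concrete, here is a minimal sketch. It is not the authors' code: the data are synthetic, and every name in it (`fit_multinomial`, `counterfactual_shares`, the "bad experience" covariate, the three opinion classes) is a hypothetical choice made for illustration. It fits a softmax regression, then sets one covariate to a fixed value for every user and averages the predicted opinion probabilities.

```python
import math
import random

def softmax(scores):
    # Subtract the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, x):
    # weights: one row per class, laid out as [bias, coef_1, coef_2, ...].
    scores = [w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)) for w in weights]
    return softmax(scores)

def fit_multinomial(X, y, n_classes, lr=0.5, epochs=200):
    # Plain batch gradient descent on the multinomial log-loss.
    n_features = len(X[0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for x, label in zip(X, y):
            probs = predict_proba(weights, x)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                grads[k][0] += err
                for j, xj in enumerate(x):
                    grads[k][j + 1] += err * xj
        for k in range(n_classes):
            for j in range(n_features + 1):
                weights[k][j] -= lr * grads[k][j] / len(X)
    return weights

def counterfactual_shares(weights, X, feature_index, value):
    # Set one covariate to `value` for everyone and average the predictions.
    # This adjusts only for covariates that are in X (an assumption).
    totals = [0.0] * len(weights)
    for x in X:
        x_cf = list(x)
        x_cf[feature_index] = value
        for k, p in enumerate(predict_proba(weights, x_cf)):
            totals[k] += p
    return [t / len(X) for t in totals]

# Synthetic users (assumption): x[0] = 1 for a bad pandemic experience,
# x[1] = some other covariate. Classes: 0 pro, 1 hesitant, 2 anti.
rng = random.Random(0)
TRUE_W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.5], [-1.0, 2.0, 0.0]]
X, y = [], []
for _ in range(400):
    x = [float(rng.random() < 0.5), rng.uniform(-1.0, 1.0)]
    y.append(rng.choices([0, 1, 2], weights=predict_proba(TRUE_W, x))[0])
    X.append(x)

weights = fit_multinomial(X, y, n_classes=3)
shares_good = counterfactual_shares(weights, X, feature_index=0, value=0.0)
shares_bad = counterfactual_shares(weights, X, feature_index=0, value=1.0)
print("opinion shares if nobody had a bad experience:", shares_good)
print("opinion shares if everybody had a bad experience:", shares_bad)
```

    The contrast between the two sets of shares can be read causally only if the regression is correctly specified and no confounder is missing from the covariates, which is the limitation raised above.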

  3. Reviewer #2 (Public Review):

    Fundamentally, it's a data analysis article where the data collected are either stated or inferred. The level of inference is high, which makes the data quality something to be careful about. As a result, the generalisability of the conclusions is also uncertain: the analysts adjusted only for pre-selected variables, and there is no indication that other factors that influence vaccine opinions were equally distributed across the groups identified.

    It's a good example of what machine learning can do with these types of data; however it's also a good example of the high data-demands of machine learning to reach its potential.

    It is good that the researchers tinkered with ways of processing the data to see if the results were consistent (they were generally consistent), and that's probably due to the type of data collected.

    I don't find that the researchers have exaggerated the generalisability of the results too much, but they could be more explicit in acknowledging that this sample of Twitter users may not be like most Americans. So the results show relative trends in a sub-sample of Twitter users, rather than strongly suggesting what formal public health policy should be.

    As such I would say it's most like a cross-sectional survey using a sample of convenience rather than another study design.

  4. SciScore for 10.1101/2020.12.12.20248070:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our current study has limitations. The public opinions of some (less populated) states cannot be reflected due to the inadequate data. The findings could be further validated in other populations. However, our study broadly captures the public opinions on the potential vaccines for COVID-19 on Twitter. By aggregating the opinions, we find a lower acceptance level in the Southeast part of the U.S. The changes of the proportions of different opinion groups correspond roughly to the major pandemic-related events. We show the hypothesized predictive effects of the characteristics of the people in predicting pro-vaccine, vaccine-hesitant, and anti-vaccine group. For example, the socioeconomically disadvantaged groups have a relatively more polarized attitude towards the potential vaccines. The personal pandemic experience and the county-level pandemic severity perception shape the opinions. Specifically, the anti-vaccine opinion is the strongest among the people who have the worst personal pandemic experience, and the vaccine-hesitancy is the strongest in the areas that have the worst pandemic severity perception. Using counterfactual analyses, we find that people are most concerned about the safety, effectiveness and politics regarding potential COVID-19 vaccines, and improving personal experience with COVID-19 increases the vaccine acceptance level. Our results can guide and support policymakers making more effective distribution policies and strategies. First, more efforts of d...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.