Applying machine-learning to rapidly analyze large qualitative text datasets to inform the COVID-19 pandemic response: comparing human and machine-assisted topic analysis techniques

Abstract

Machine-assisted topic analysis (MATA) uses artificial intelligence methods to help qualitative researchers analyze large datasets. This helps researchers rapidly update healthcare interventions when the healthcare context is changing, such as during a pandemic. We examined the potential to support healthcare interventions by comparing MATA with “human-only” thematic analysis techniques on the same dataset (1,472 user responses from a COVID-19 behavioral intervention).

Methods

In MATA, an unsupervised topic-modeling approach identified latent topics in the text, from which researchers identified broad themes. In human-only codebook analysis, researchers developed an initial codebook based on previous research that was applied to the dataset by the team, who met regularly to discuss and refine the codes. Formal triangulation using a “convergence coding matrix” compared findings between methods, categorizing them as “agreement”, “complementary”, “dissonant”, or “silent”.
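
The bookkeeping behind a convergence coding matrix can be illustrated with a minimal sketch. It assumes each row pairs one human-derived theme with the matching MATA topic (if any) and carries one of the four categories named above. The row contents, field names, and the function `tally_convergence` are hypothetical placeholders for illustration, not data or code from the study.

```python
from collections import Counter

# The four categories named in the Methods text.
CATEGORIES = ("agreement", "complementary", "dissonant", "silent")


def tally_convergence(matrix):
    """Count how many matrix rows fall into each convergence category."""
    counts = Counter({category: 0 for category in CATEGORIES})
    for row in matrix:
        category = row["category"]
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        counts[category] += 1
    return dict(counts)


# Hypothetical placeholder rows; these are NOT findings from the study.
matrix = [
    {"human_theme": "theme A", "mata_topic": "topic 1", "category": "agreement"},
    {"human_theme": "theme B", "mata_topic": "topic 2", "category": "complementary"},
    {"human_theme": "theme C", "mata_topic": None, "category": "silent"},
]

summary = tally_convergence(matrix)
print(summary)
```

Rejecting unknown labels keeps a mistyped category from silently creating a fifth bucket, which matters when several coders fill in the matrix.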

Results

Human analysis took much longer than MATA (147.5 vs. 40 h). Both methods identified key themes about what users found helpful and unhelpful. Formal triangulation showed that the two sets of findings were highly similar: all MATA codes were classified as in agreement with, or complementary to, the human themes. Where findings differed slightly, this was due to human researcher interpretations or nuance from human-only analysis.
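
The size of the time saving follows directly from the two reported figures. A small worked calculation, using only the 147.5 h and 40 h values stated above (the variable names are illustrative):

```python
# Analysis time reported in the Results, in hours.
human_hours = 147.5
mata_hours = 40.0

hours_saved = human_hours - mata_hours            # absolute saving in hours
speedup = human_hours / mata_hours                # how many times faster MATA was
percent_saved = 100 * hours_saved / human_hours   # saving as a share of human time

print(f"Saved {hours_saved} h; MATA was {speedup:.1f}x faster "
      f"({percent_saved:.1f}% less time).")
```

In other words, MATA took roughly a quarter of the time needed for the human-only analysis.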

Discussion

Results produced by MATA were similar to human-only thematic analysis, with substantial time savings. For simple analyses that do not require an in-depth or subtle understanding of the data, MATA is a useful tool that can support qualitative researchers to interpret and analyze large datasets quickly. This approach can support intervention development and implementation, such as enabling rapid optimization during public health emergencies.

SciScore for 10.1101/2022.05.12.22274993:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Ethics	Consent: Participants: Inclusion criteria were users of the Germ Defence website who were over the age of 18 and able to give informed consent. IRB: Ethical approval was granted by the University of Southampton Psychology Ethics Committee (ID: 56445).
Sex as a biological variable	not detected.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.

Table 2: Resources

No key resources detected.

Results from OddPub: Thank you for sharing your data.

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

However, since the goal of this analysis, and the use case for MATA in general, was to rapidly extract headline insights, this limitation can be mostly overlooked. Nevertheless, researchers should be mindful of these potential issues when they come to interpret the output of the AI. Due to these considerations, MATA could potentially be seen as a less interpretive method than human-only analysis that is suitable for more descriptive studies of large datasets. Indeed, the concept of information power recommends larger samples for studies with broader, atheoretical, more exploratory aims [26]. In order to complete the human-only analysis of a sample of this size, a codebook was created based on previous Germ Defence research, and six research assistants needed to be trained in qualitative analysis. It would not have been feasible to conduct a purely inductive thematic analysis using a large number of coders due to differences in how individuals would interpret and label the data. Other methods of coding large-scale data, such as crowdsourcing through Amazon Mechanical Turk, have been shown to be successful when coding deductively into pre-determined categories [27,28,29]. However, in the absence of these categories, such as in more inductive approaches or studies with more exploratory aims, there have previously been few options available to researchers other than to perform human analyses on limited sample sizes. Approaches such as MATA could be a valuable tool for enabling lar...

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.


Related articles

Exploring Patterns in Online Discussions into the Lingering Impact of COVID-19, Two Years On

Developing and Testing a Framework for Coding General Practitioners' Free-Text Diagnoses in Electronic Medical Records - A Reliability Study for Generating Training Data in Natural Language Processing

Large Language Model in Medical Information Extraction from Titles and Abstracts with Prompt Engineering Strategies: A Comparative Study of GPT-3.5 and GPT-4