British news media representations of mpox during the 2022 and 2024 outbreaks: a mixed-methods analysis using corpus linguistics
Abstract
Background
Since 2022, over 100,000 people across 100 countries have been diagnosed with mpox (formerly monkeypox, renamed by the World Health Organization (WHO) in November 2022). News media plays a central role in outbreaks, disseminating information and shaping public discourse. Corpus linguistic approaches to massive language datasets can reveal how such outbreaks are represented but remain under-used in public health. Using these methods, we investigated representations of mpox in British news media during the 2022 and 2024 outbreaks.
Methods
We analysed the 83-billion-word English Trends corpus in SketchEngine, quantifying use of “monkeypox” and “mpox” in 2022–2024, and applying Corpus-Assisted Discourse Studies to compare news media content from 2022 and 2024 (peak incidence periods), comprising 1.2 billion and 500 million words, respectively. Using corpus linguistic tools, we explored the contexts in which mpox and monkeypox occurred, assessing shifts in representation.
Findings
Monthly use of “monkeypox” peaked at 0.07 occurrences per million words (n=6591) in May 2022, dropping by 99.8% by November 2022. Grammatical and lexical analysis of 2022 reporting found frequent attribution of blame for transmission, particularly to gay and bisexual men who have sex with men (GBMSM). In 2024, coverage adopted more neutral language, largely avoiding stigmatisation.
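The per-million figures reported here are relative frequencies: raw hits divided by the number of words in the corpus slice, scaled to one million words. The following is a minimal sketch of that normalisation and of a percentage change between two months. The function names, counts and corpus sizes are hypothetical placeholders chosen for illustration, not values or code from the study.

```python
def per_million(hits: int, corpus_words: int) -> float:
    """Relative frequency: occurrences per million words of corpus."""
    if corpus_words <= 0:
        raise ValueError("corpus_words must be positive")
    return hits / corpus_words * 1_000_000


def percent_change(before: float, after: float) -> float:
    """Signed percentage change from `before` to `after`."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100


# Hypothetical monthly counts (illustrative only, not the study's data).
peak = per_million(hits=5_000, corpus_words=2_000_000_000)
later = per_million(hits=10, corpus_words=2_000_000_000)
print(f"{peak:.3f} -> {later:.3f} per million "
      f"({percent_change(peak, later):.1f}% change)")
```

Normalising by corpus size makes months of different sizes comparable, which raw hit counts alone do not.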
Interpretation
UK news media reporting on mpox shifted from stigmatising language in 2022, often targeting GBMSM, to more neutral and inclusive coverage in 2024. The WHO-endorsed nomenclature change may have contributed, illustrating the impact of such interventions. This study demonstrates the value of corpus methods in tracking linguistic representations of infectious disease outbreaks.
Funding
This work is part of the VERDI project (101045989) which is funded by the European Union. Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union. BM, MLP and ST are partly funded through a Wellcome Accelerator Award held by ST (316319/Z/24/Z).
Research in context
Evidence before this study
We reviewed existing literature on media representations of mpox, and other infectious diseases, focusing on stigma, framing, and public perception. We searched academic databases, including studies that examined media discourse and linguistic framing, and those using qualitative or corpus-based methods. While some explored stigma in mpox media coverage, few studies applied computational (computer-based) linguistic analysis to large-scale media datasets, or compared media narratives across different outbreak periods.
Added value of this study
This study is the first to use a ‘big data’ approach to explore representations of mpox. It uses corpus linguistic methods to analyse over two billion words of British news media content across two mpox outbreaks. It reveals a shift from stigmatising language in 2022—often targeting specific communities—to more neutral and inclusive reporting in 2024. The findings demonstrate how media language evolves in response to public health guidance and highlight the potential of corpus linguistic methods to uncover patterns in public discourse.
Implications of all the available evidence
Media language plays a powerful role in shaping public understanding and attitudes during health emergencies. This study shows that changes in terminology and framing can reduce stigma and improve public health communication. These insights support the need for proactive media guidance and the use of linguistic analysis to inform future policy, practice, and research in outbreak response and health communication.
Article activity feed
This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/17452103.
Summary
This paper applies large-scale corpus linguistic methods to approximately 2.3 billion words of British English news data to examine changes in British media coverage across the two mpox (formerly monkeypox) outbreaks in 2022 and 2024. The authors compared the frequency and co-occurrence patterns of "monkeypox" and "mpox" using collocation parameters of span = 5, logDice ≥ 2, and minimum frequency (minfreq) ≥ 5, and manually reviewed a sample of up to 100 co-occurrence lines. They found that, while reports in 2022 frequently linked the spread to the GBMSM community, reports in 2024 adopted a more neutral tone, using impersonal constructions such as "mpox spreads...". Figure 1 (page 5) shows a sharp decrease in the frequency of "monkeypox" after the WHO issued its naming recommendation in November 2022. In the discussion, the authors suggest this indicates that the WHO naming guidance "achieved its intended effect." The main contribution of this paper is to reveal changing trends in media discourse on infectious diseases using rigorous and reproducible computational linguistic methods. However, equating linguistic change with "decreased stigma" lacks direct evidence: word frequency and co-occurrence metrics indicate changes at the linguistic level only, not shifts in societal attitudes. The decline in "monkeypox" usage may also be due to the fading of the epidemic and reduced media attention, rather than to the effects of the renaming itself. Strengths of the paper include innovative methods, high transparency, and strong relevance to public health communication. Limitations include the analysis of English-language media only, some over-extrapolation of conclusions, and a lack of sociological validation. Overall, the study strongly demonstrates a linguistic destigmatization trend in British media, but cannot definitively prove a reduction in societal stigma.
I suggest that the authors consider major revisions to the causal tone (in other words, provide alternative explanations) and clarify the scope of application and data boundaries in the Conclusions and Limitations sections.
Major Issues
1. The causal relationship is not strongly supported.
The article repeatedly attributes language changes to the WHO naming policy, but lacks empirical data at the behavioral or attitudinal level; the evidence is limited to covariation between word frequency and co-occurrence.
2. Alternative explanations are not fully discussed.
The declining frequency of "monkeypox" may be due to the easing of the epidemic and decreased media attention, rather than an effect of the renaming.
3. The authors collected data from English-language media only.
Since British society is multilingual, the results are not universally representative.
4. The key concept of "stigma" is not fully defined.
This article primarily infers "stigma" through linguistic indicators, for example, co-occurring words and grammatical agents, but the logical connection between these indicators and social attitudes is not clearly defined.
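To illustrate the concern in issue 2, one possible check is to look at the share of all disease mentions that use the old term, alongside the total number of mentions. A fall in attention lowers the total but leaves the share unchanged, whereas a naming shift lowers the share. The sketch below uses hypothetical counts and a hypothetical helper name; it is not an analysis from the manuscript.

```python
def old_term_share(old_hits: int, new_hits: int) -> float:
    """Fraction of all disease mentions that use the old term."""
    total = old_hits + new_hits
    if total == 0:
        raise ValueError("no mentions in this period")
    return old_hits / total


# Hypothetical (old-term hits, new-term hits) per period; not study data.
periods = {
    "A": (1000, 0),  # high attention, old name only
    "B": (100, 0),   # attention fell tenfold, naming unchanged
    "C": (10, 90),   # same attention as B, but naming shifted
}

for label, (old, new) in periods.items():
    print(label, "total mentions:", old + new,
          "old-term share:", round(old_term_share(old, new), 2))
```

In this toy data, periods B and C have the same total attention, yet only C shows a change in naming. Raw "monkeypox" frequency alone would not distinguish them from a general decline.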
Minor Issues
1. Interpretation for terminology
Terms like "collocates" and "logDice" could be further explained for readers who lack a statistics background.
2. Multiple language consideration
The discussion could cite international literature on mpox stigma in other languages to broaden the external context.
Section-by-Section Review
Abstract
The abstract provides an accurate summary of the study's objectives, corpus-linguistic methods, and key results, describing shifts in British media language from stigmatizing to neutral framing between 2022 and 2024. It is concise and clearly written, but the causal phrasing "WHO naming had the intended impact" slightly overstates the evidence. In my view, this statement would be better rephrased as "media language became more neutral following WHO guidance." The abstract is otherwise engaging, but would benefit from one brief acknowledgment of limitations, namely that the dataset is English-only and that linguistic findings do not necessarily represent public attitudes across the entire British community.
Introduction
The introduction effectively outlines the public-health context of Mpox, its renaming history, and the need to examine media representations. Novelty and significance are adequately highlighted through the emphasis on corpus methods applied to public-health communication.
Methods
The methods section is detailed and replicable, clearly describing corpus composition and collocation parameters. However, terms such as "logDice" should be defined further.
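For readers unfamiliar with the measure, logDice scores how strongly two words are associated: logDice = 14 + log2(2 × f_xy / (f_x + f_y)), where f_xy is the number of co-occurrences and f_x and f_y are the frequencies of the node word and the collocate. Higher values mean stronger association. The sketch below applies the thresholds quoted in this review (span = 5, logDice ≥ 2, minimum frequency ≥ 5) to a toy token list. It is an illustration under simplifying assumptions (whitespace tokens, no lemmatisation, hypothetical function names) and is not Sketch Engine's implementation, which may count co-occurrences differently.

```python
import math
from collections import Counter


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """logDice = 14 + log2(2 * f_xy / (f_x + f_y))."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


def collocates(tokens, node, span=5, min_logdice=2.0, min_freq=5):
    """Return {collocate: logDice} for words within `span` tokens of `node`."""
    freq = Counter(tokens)
    # Collect every position that lies within `span` tokens of a node hit;
    # using a set means each token position is counted at most once.
    positions = set()
    for i, tok in enumerate(tokens):
        if tok == node:
            positions.update(range(max(0, i - span),
                                   min(len(tokens), i + span + 1)))
    co = Counter(tokens[p] for p in positions if tokens[p] != node)
    scores = {}
    for word, f_xy in co.items():
        if f_xy < min_freq:
            continue  # too rare to be a reliable collocate
        score = log_dice(f_xy, freq[node], freq[word])
        if score >= min_logdice:
            scores[word] = score
    return scores


# Toy corpus (illustrative only): the same short sentence repeated six times.
toy = ("mpox spreads through close contact " * 6).split()
print(collocates(toy, "mpox"))
print(log_dice(5, 10, 10))  # 14 + log2(0.5) = 13.0
```

In the toy corpus every word always appears near "mpox", so all collocates receive the same high score; real corpora produce a spread of values, and the thresholds filter out weak or rare pairings.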
Results
Figures 1 and 2 effectively illustrate term-frequency changes. The only interpretive concern is that decreased "monkeypox" frequency may partly reflect outbreak decline or media fatigue rather than purely linguistic reform. No additional experiments are necessary; acknowledging these contextual factors would strengthen the interpretation.
Discussion and Conclusion
The discussion interprets the results coherently within health communication research, aligning with existing work on HIV media stigma. The main issue is causal language: statements implying that WHO naming caused destigmatization should be softened to reflect correlation rather than proven effect. Limitations are mentioned but need to be defined further, especially the English-only dataset, the indirect measurement of stigma, and potential confounding from media-interest cycles. The implications of this study are relevant to future media-language guidance.
References
The reference list meets scholarly standards and supports the integrity of the manuscript. It is comprehensive, recent, and appropriately balanced between linguistic and public-health literature.
Competing interests
The author declares that they have no competing interests.
Use of Artificial Intelligence (AI)
The author declares that they did not use generative AI to come up with new ideas for their review.