Unreviewed science in the news: The evolution of preprint media coverage from 2014–2021
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (PREreview)
- Preprint review and curation (scietyHQ)
- Preprint review and curation (mark2d2)
- ASAPbio Meta-Research Crowd PREreviews (prereview)
Abstract
It has been argued that preprint coverage during the COVID-19 pandemic constituted a paradigm shift in journalism norms and practices. This study examines whether and in what ways this is the case using a sample of 11,538 preprints posted on four preprint servers—bioRxiv, medRxiv, arXiv, and SSRN—that received coverage in 94 English-language media outlets between 2014 and 2021. We compared mentions of these preprints with mentions of a comparison sample of 397,446 peer-reviewed research articles indexed in the Web of Science to identify changes in the share of media coverage that mentioned preprints before and during the pandemic. We found that preprint media coverage increased at a slow but steady rate prepandemic, then spiked dramatically. This increase applied only to COVID-19-related preprints, with minimal change in coverage of preprints on other topics. The rise in preprint coverage was most pronounced among health and medicine-focused media outlets, which barely covered preprints before the pandemic but mentioned more COVID-19 preprints than outlets focused on any other topic. These results suggest that the growth in coverage of preprints seen during the pandemic may imply only a temporary shift in journalistic norms, including a changing outlook on reporting preliminary, unvetted research.
Article activity feed
-
This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/8232591.
This review reflects comments and contributions from Melissa Chim, Philip N. Cohen, Martyn Rittman, Stephen Gabrielson, Yueh Cho, Jonny Coates, Nicolás Hinrichs, and Kim Powell. Review synthesized by Stephen Gabrielson.
This study argues that media coverage of preprints saw an unprecedented increase due to COVID-19. The authors give a thorough review of preprint coverage before the pandemic and demonstrate the paradigm shift in media coverage during it. The manuscript carefully discusses the trend of preprints through the COVID-19 era and summarizes how the media coverage of these preprints shifted across different sources.
Major comments:
-
I am very impressed by the breadth of research these authors performed. I also appreciate the universality of their writing: although it focuses on science, I can see academics in the Humanities being inspired by this type of research. Thank you for sharing this work.
-
I appreciated that the authors systematically addressed this phenomenon with consideration of the time of preprint posting, the preprint servers, topic relation to COVID-19 on the preprint side, and the professional specialty of reporters across the different media outlets that covered preprints from 2014–2021. In addition, in this post-COVID-19 era, more attention is being paid to the quality of preprints, much as with manuscripts submitted to journals, and to the quality of the media coverage that shares the contents of preprints with the public.
-
This is a very interesting article. The authors have systematically organized the various components of their work giving it a good flow.
-
The authors' meticulous methodology is remarkable, considering all relevant factors and variables included in a study on this topic, amounting to a considerable contribution towards preprint culture.
Minor comments:
-
The definition of "publication" could be clearer, e.g., "publication date within seven days of the published version's publication date." I note that posting a preprint is also "publishing" it. It would be helpful to clarify terms and specify when you are referring to journal publication versus preprint posting.
-
It is unclear to me how the corpus of WoS articles was selected. Was it all articles mentioned in the 94 outlets? Was it limited to certain disciplines? A little more explanation is needed.
-
In the statistical reporting, the numerators and denominators behind the term "share" are not clear, e.g., in Table 6. Maybe there are discipline differences, but I (a sociologist) expect "shares" to sum to 100%. The table note just uses "share" without specifying the numerator and denominator.
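The ambiguity raised here can be made concrete with a small sketch. Both readings below are plausible interpretations of "share"; all counts are invented for illustration and are not taken from the paper:

```python
# Hypothetical yearly media-mention counts (invented for this example).
mentions = {
    "preprints": 300,        # media mentions of preprints
    "wos_articles": 9700,    # media mentions of peer-reviewed WoS articles
}

# Reading 1: preprint mentions as a share of ALL research mentions.
# Shares under this reading sum to 100% across categories.
total = sum(mentions.values())
share_of_all_mentions = mentions["preprints"] / total

# Reading 2: preprint mentions as a ratio to peer-reviewed mentions.
# Ratios under this reading do NOT sum to 100%.
ratio_to_peer_reviewed = mentions["preprints"] / mentions["wos_articles"]

print(f"{share_of_all_mentions:.2%}")   # 3.00%
print(f"{ratio_to_peer_reviewed:.4f}")  # 0.0309
```

The two readings give different numbers from the same counts, which is why stating the numerator and denominator in the table note would remove the ambiguity.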
-
I wonder whether the authors explored expanding their preprint data beyond the four servers in their study to also include Europe PMC or perhaps Research Square, which is one of the eligible preprint servers under the NIH Preprint Pilot. Some quick checking shows that there are COVID-19 preprints there, so more discussion of the reasoning behind the selection of preprint servers would be helpful.
-
I would like to see more discussion about the possibility of changes in behavior by authors and preprint servers. Given the urgency during the pandemic, more newsworthy preprints were probably being posted, and a new set of high-profile authors who hadn't posted preprints before were starting to, which would increase the proportion of preprint mentions. At the same time, preprint servers were probably becoming more conservative in their evaluations of preprints and not posting lower-quality studies due to the possibility of a negative impact (I was in this situation myself and became more cautious). These are certainly smaller effects than the choices made by journalists and the possible drivers of their change in behavior, but they seem worthy of discussion.
-
I'd like to see data comparing larger organisations with smaller outlets; the expertise and presence of science-focussed journalists likely differ and may affect the amount of coverage. I'd also like to see the political leanings of the organisations and whether these affected coverage, though this is by no means required for this preprint.
-
The authors should highlight confounding variables in their study, especially in addressing RQ1. Generally, COVID-19 articles would be trending during a pandemic, so journalistic norms could have been influenced by audience expectations. Is it possible to clarify that such confounding variables did not affect the study's conclusions?
-
I'd appreciate more granular distinctions between the types of institutions or groupings of entities that authors are affiliated with, as these may have particular (perhaps even idiosyncratic) stances and/or norms.
-
I would like to know why the term "News 52" is highlighted in the keywords section.
-
In the introduction section on line 66, I think it would be worthwhile to mention that Altmetric data relies on DOI tracking.
-
On line 67, would this also be an appropriate place to mention that bioRxiv was founded in 2013, and medRxiv not until 2019?
-
On line 71, arXiv launched in 1991, and the manuscript mentions that medRxiv launched in 2019 at line 281. The founding dates of bioRxiv and SSRN are not mentioned. This could be useful information to include in the Introduction (with the medRxiv founding mentioned earlier).
-
Beginning on line 186, the authors discuss how they filtered for news outlets that covered a high volume of research. I do wonder whether there is a difference between those outlets and outlets that cover smaller volumes of research; one might expect the bigger outlets to be better equipped to deal with preprints. The cut-off of 100 mentions per year seems arbitrary. How many outlets were identified overall, and what percentage of the total does this represent?
-
On line 220, how were reliable publication dates for each server assessed?
-
On line 438, the authors discuss how news media drew on particular servers based on their interest. Could it also be explained by a change in behaviour of authors: that they were sending more news-worthy preprints to bioRxiv/medRxiv?
-
Media and journalism content is generally driven by the audience. Could this have been a confounding variable contributing to the shift in journalism norms? During the COVID-19 pandemic, information on COVID-19 was "selling", so could this have been a driving influence for the paradigm shift?
-
For section 4.3, I wonder how coverage differed (or not) across types of COVID-19 article, e.g., immunology versus epidemiology.
-
On line 469, it may be helpful to break down the outlets by political leaning too: did right-wing outlets cover less science than left-wing outlets? I remember a study addressing a similar question.
-
Beginning on line 502, has this shift in journalistic norms and practices continued? The authors may be able to comment on that without requiring additional data.
-
On line 572, the authors do a good job of clearly stating the limitations of their work.
Comments on reporting:
-
On line 250, the exact date of Web of Science data collection would be good to know.
Suggestions for future studies:
-
Beginning on line 548, I appreciate this nod to future studies here. The authors do an excellent job throughout incorporating philosophy into their research.
Competing interests
The author declares that they have no competing interests.