Enabling preprint discovery, evaluation, and analysis with Europe PMC
Listed in
- Preprint review and curation (mark2d2)
Abstract
Preprints provide an indispensable tool for rapid and open communication of early research findings. Preprints can also be revised and improved based on scientific commentary uncoupled from journal-organised peer review. The uptake of preprints in the life sciences has increased significantly in recent years, especially during the COVID-19 pandemic, when immediate access to research findings became crucial to address the global health emergency.
With ongoing expansion of new preprint servers, improving discoverability of preprints is a necessary step to facilitate wider sharing of the science reported in preprints. To address the challenges of preprint visibility and reuse, Europe PMC, an open database of life science literature, began indexing preprint abstracts and metadata from several platforms in July 2018. Since then, Europe PMC has continued to increase coverage through addition of new servers, and expanded its preprint initiative to include the full text of preprints related to COVID-19 in July 2020 and then the full text of preprints supported by the Europe PMC funder consortium in April 2022. The preprint collection can be searched via the website and programmatically, with abstracts and the open access full text of COVID-19 and Europe PMC funder preprint subsets available for bulk download in a standard machine-readable JATS XML format. This enables automated information extraction for large-scale analyses of the preprint corpus, accelerating scientific research of the preprint literature itself.
This publication describes steps taken to build trust, improve discoverability, and support reuse of life science preprints in Europe PMC. Here we discuss the benefits of indexing preprints alongside peer-reviewed publications, and challenges associated with this process.