Assessment of transparency indicators across the biomedical literature: How open is open?

This article has been Reviewed by the following groups

Read the full article

Abstract

Recent concerns about the reproducibility of science have led to several calls for more open and transparent research practices and for the monitoring of potential improvements over time. However, with tens of thousands of new biomedical articles published per week, manually mapping and monitoring changes in transparency is unrealistic. We present an open-source, automated approach to identify 5 indicators of transparency (data sharing, code sharing, conflicts of interest disclosures, funding disclosures, and protocol registration) and apply it across the entire open access biomedical literature of 2.75 million articles on PubMed Central (PMC). Our results indicate remarkable improvements in some areas of transparency over time (e.g., conflict of interest [COI] disclosures and funding disclosures) but not in others (e.g., protocol registration and code sharing), and map transparency across fields of science, countries, journals, and publishers. This work has enabled the creation of a large, integrated, and openly available database to expedite further efforts to monitor, understand, and promote transparency and reproducibility in science.

Article activity feed

  1. SciScore for 10.1101/2020.10.30.361618:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    All algorithms were constructed to take a PDF file of a typical article available on PubMed and output (a) whether it has identified a statement of relevance, (b) why it is of relevance and (c) what the exact phrase of relevance is.
    PubMed
    suggested: (PubMed, RRID:SCR_004846)
    Data were standardised to the most common label to start with (i.e. if “BMJ” occurred more commonly than “British Medical Journal”, then the two were standardised into “BMJ”), apart from countries, which were standardised to the name given in ggplot2 [44], which is an R package that we used to create maps.
    ggplot2
    suggested: (ggplot2, RRID:SCR_014601)
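    The label-standardisation step quoted above (collapsing variant names such as "British Medical Journal" into the most common form, "BMJ") can be sketched roughly as follows. This is an illustrative sketch, not the authors' code; the `synonyms` mapping and function names are hypothetical.

```python
from collections import Counter

def standardise_labels(values, synonyms):
    """Map each value to the most frequent label within its synonym group.

    `synonyms` (hypothetical input) maps every variant spelling to a group
    key; within each group, the label occurring most often in `values` wins
    (e.g. if "BMJ" outnumbers "British Medical Journal", both become "BMJ").
    """
    # Count occurrences of every raw label.
    counts = Counter(values)
    # Collect the variants belonging to each synonym group.
    groups = {}
    for variant, key in synonyms.items():
        groups.setdefault(key, []).append(variant)
    # Within each group, pick the most common variant as the standard label.
    standard = {
        variant: max(members, key=lambda m: counts[m])
        for members in groups.values()
        for variant in members
    }
    return [standard.get(v, v) for v in values]

journals = ["BMJ", "BMJ", "British Medical Journal", "PLOS ONE"]
syn = {"BMJ": "bmj", "British Medical Journal": "bmj", "PLOS ONE": "plos1"}
print(standardise_labels(journals, syn))  # ['BMJ', 'BMJ', 'BMJ', 'PLOS ONE']
```

    The same idea extends to the countries case mentioned in the quote, where the target labels come from a fixed reference list (the names used by ggplot2's map data) rather than from frequency counts.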

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

    Limitations: First, the performance of our algorithms was assessed in a recent sample of PMC. As such, we cannot be certain of their performance across a sample of articles outside PMC, older articles, or articles published in the future. Similarly, we cannot be certain that average performance is representative of performance within specific fields of science, journals or publishers. However, the majority of articles in the published literature have been published after 2000 and the findings of our algorithms corroborate those seen in the current and previous manual assessments [20,22].

    Second, we are only using the open access subset of PMC. There are other freely available articles within as well as outside PMC that we did not assess. However, previous publications have illustrated that the Open Access Subset is of similar composition as the non-Open Access Subset of PMC [22], and PMC represents about half of the recently published literature available on PubMed.

    Third, algorithms in general do not guarantee that the statements identified are either accurate, full, or true. This is why we have designed our algorithms to not only extract text, but also indicate why an article was labelled as including information on an indicator. We hope that future work will study these questions further.

    Finally, some of the XML versions used in our biomedical open literature-wide search do not include the COI or Funding disclosures that were included in the published PDF. Nevertheless, g...


    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of the rigor criteria and the tools shown here, including references cited, please follow this link.
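    As an illustration of the simplest part of such a check, detecting whether a paper mentions any RRIDs at all can be done with a pattern match. The regex below is a hypothetical sketch covering only the common `RRID:PREFIX_accession` shape (e.g. software and antibody identifiers), not SciScore's actual implementation, which also verifies that each identifier resolves correctly.

```python
import re

# Hypothetical pattern for common RRID shapes seen in papers,
# e.g. RRID:SCR_004846 (software), RRID:AB_2313773 (antibodies).
RRID_PATTERN = re.compile(r"RRID:\s*([A-Z]+_[A-Za-z0-9_:-]+)")

def find_rrids(text):
    """Return all RRID accessions mentioned in `text`, in order."""
    return RRID_PATTERN.findall(text)

print(find_rrids("ggplot2 (RRID:SCR_014601) and PubMed (RRID:SCR_004846)"))
# ['SCR_014601', 'SCR_004846']
```

    Presence detection alone says nothing about correctness; a full check would look each accession up in its registry.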