Open Science interventions to improve reproducibility and replicability of research: a scoping review (preprint)

Abstract

Various interventions – especially those related to open science – have been proposed to improve the reproducibility and replicability of scientific research. To assess whether and which interventions have been formally tested for their effectiveness in improving reproducibility and replicability, we conducted a scoping review of the literature on interventions to improve reproducibility. We systematically searched Medline, Embase, Web of Science, PsycINFO, Scopus and ERIC on August 18, 2023. Grey literature was requested from experts in the fields of reproducibility and open science. Any study empirically evaluating the effectiveness of interventions aimed at improving the reproducibility or replicability of scientific methods and findings was included. An intervention could be any action taken by either individual researchers or scientific institutions (e.g., research institutes, publishers and funders). We summarized the retrieved evidence narratively and in an evidence gap map. Of the 104 distinct studies we included, 15 directly measured the effect of an intervention on reproducibility or replicability, while the remaining studies addressed research questions targeting a proxy outcome that might be expected to increase reproducibility or replicability, such as data sharing, methods transparency or preregistration. Thirty research questions within the included studies were non-comparative and 27 were comparative but cross-sectional, precluding any causal inference. A possible limitation of our review is the search and selection strategy, which was carried out by a large team including researchers from different disciplines and with different levels of expertise. Despite the included studies investigating a range of interventions and addressing various outcomes, our findings indicate that, in general, the evidence base for interventions to improve the reproducibility of research remains remarkably limited in many respects.

Article activity feed

  1. This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/13149713.

    This review reflects comments and contributions from Nicolás Hinrichs, Saeed Shafiei Sabet, Melissa Chim, David Makoko, Chalermchai Rodsangiam, Martyn Rittman, Queen Saikia, Konstantinos Geles, Neeraja M Krishnan, Vanessa Bijak, and Stephen Gabrielson. Review synthesized by Stephen Gabrielson.

    This meta-research piece examines whether interventions that aim to improve reproducibility and replicability have been formally tested for their efficacy. The results are summarized in bubble charts, and the authors' main claim is that the evidence base appears limited across the board.

    Minor comments:

    • I appreciate the breadth of the overview that the maps provide, as well as the clarity in the portrayal of individual issues, but I find the lack of metrics accompanying the claims worrying. These would need to be summarized, at the very least, to be useful to end users of this kind of material; the omission is surprising, given how carefully the methods were designed.

    • Regarding authorship, it would be good to be more precise about which subject areas the authors had expertise in, and hence (as a limitation of the study) where there were gaps in expertise.

    • It would have been valuable to present the percentage breakdown by discipline of the pre-screened cohort of studies, to assess potential bias in the screening procedure and/or the absence of intervention studies in specific fields. Other than that, the review is methodologically sound.

    • Perhaps the authors can explain how to interpret the evidence gap maps a bit more. For instance, it is not intuitive to me what causes overlapping bubbles versus two well-separated bubbles.

    • Regarding the section "Disciplinary scope of interventions", the map itself provides an excellent portrayal and even offers useful granularity in quantifying the issues, so I would expect metrics to support the specific claims made here (at least as an excerpt from the figure that follows).

    • Figure xx cited on page 43 is not accessible.

    • In the abstract, should the disclaimer for the protocol pre-registration and funder statement be included elsewhere in the manuscript?

    • Where replicability is defined in the Introduction: 'replicability' is also defined in Supplementary File 6, as 'to obtain the same results for the same research question, using the same analytical method but on a different sample (European Commission 2020)'. I would suggest that the authors expand on the definition in the Introduction as well.

    • In paragraph two of the Introduction, it would be beneficial to clarify the difference between the terms 'reproducibility' and 'replicability' to help better understand the meaning of this sentence: "A piece in Nature News in 2016 reported survey findings (ironically themselves lacking in rigour and transparency) that highlighted that between 60% and 80% of scientists across various disciplines encountered hurdles in reproducing the work of their peers, with similarly noteworthy difficulties encountered when attempting to replicate their own experiments (40% to 60%)". Did the scientists attempt to replicate their own experiments using alternate datasets or did they attempt to reproduce the results with the exact same data?

    • In paragraph three of the Introduction, I'd expect specific sets of principles known from public advocacy on the topic (e.g., the FAIR principles, as promoted by NASA's open-science advocacy) to be named when referring to "various strategies and practices that have the potential to address the rigour and reliability of large areas of scholarly work", so as to ground the discussion in examples that make it accessible to someone with a specific interest in the matter.

    • In the last paragraph of the Introduction, I'd expect more than a citation of an opinion here (reference #44); instead, I'd suggest inserting a couple of examples of the empirical data we do have at the moment and explaining why it does not yet provide a clear picture.

    • There was discussion amongst reviewers about the placement of the "Outcomes" sub-section. Some thought that it should come right before the "Results" section, because it interrupts the flow between "Study design" and "Search strategies". Others thought that its placement was fine: the sub-section "Development of the search query" indicates that the search terms included both interventions and outcomes, which may be why "Outcomes" was placed before "Search strategies".

    • A good point is made in the section "State of the evidence on effectiveness of interventions": "As many proposed interventions in this space may increase workload and costs for researchers, editors and other stakeholders, it is important that we know these resources are being spent on practices that are evidenced as being effective."

    • While the authors note that a limitation of this study was that most articles retrieved from their search were in English, could they consider searching other publication sources with a more global scope that might contain non-English-language articles? Maybe Google Scholar, Dimensions, or preprint servers (SciELO, AfricArXiv, etc.)?

    Comments on reporting:

    • The use of preregistration and PRISMA-ScR gave me a lot of confidence in how this study was conducted.

    • The analyses were done with appropriate controls. The reporting was done comprehensively and all caveats were spelt out.

    Suggestions for future studies:

    • I like that the authors briefly mention the social sciences, and I think that future research could focus on the social sciences as a whole, or even on one field within that group.

    • Something I am always curious about with these kinds of studies: are they conducted by researchers with a background in meta-research? I often see such studies done by researchers with domain experience but not experience in researching open science. In those cases, I am tempted to think that they veer more towards advocacy and are likely to have weaker designs and to suffer from confirmation bias, as suggested here.

    Competing interests

    The authors declare that they have no competing interests.