Replication report for Marwick (2025) “Is archaeology a science?”, including new data from OpenAlex
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (Peer Community in Archaeology)
Abstract
This document is a reproduction and replication of the first part of Ben Marwick's paper published in the Journal of Archaeological Science, which analyzes the hard/soft position of archaeology and its evolution through time using bibliometric data as a proxy (Marwick, 2025). I confirm the complete computational reproducibility of Marwick (2025), while also pointing to a few problems in the manuscript. As for the replication of the study: while Marwick's article is based on Web of Science data for archaeological journals and articles, I use data from OpenAlex, a free and open-source database. The analysis of the OpenAlex data confirms the trends visible in the replicated study for the publication patterns of archaeological journals, for their evolution through time, and for the classification of different journals. Some differences are still visible, mainly because OpenAlex is less influenced by recent trends in the publication process, owing to its more balanced coverage of the second half of the 20th century. This study also shows that the free and open-source OpenAlex database is suitable for this kind of scientometric study in place of commercial databases.
Article activity feed
I want to thank the author for sharing this reproduction work, which all reviewers highlighted as a great improvement for the archaeological field. This kind of work needs to be put forward, as it allows our discipline to win more "credibility". The work of Perreault (2019) was, for many of us I am sure, a turning point in seeing how "weak" the scientificity of archaeological science can appear when reproducibility of the experiment is not possible. This work takes the opposite approach by building upon a previous experiment by B. Marwick.
That said, reviewers, while encouraging this approach, also have some comments on the preprint. While most of them discuss some choices regarding the method and the "discussion" with B. Marwick's paper, other minor comments have been raised on the code or other parts.
I would therefore recommend revisions of this preprint in its current state.
-
This is a really great paper, and it is very important in that it sets the standard for reproduction and replication studies in archaeological research. I especially appreciate the casual and discursive tone, which encourages iterative improvement of research practice rather than an adversarial and shame-inducing "science cop" mentality.
That being said, I do have some concerns regarding clarity, and specifically regarding disambiguation of the present work and the Marwick paper that it seeks to reproduce and replicate. It may be helpful to consider reproduction and replication papers as interpretive iterations of prior work, and as such the parameters through which interpretations are made need to be thoroughly and concretely articulated. While I was able to detect these throughout the thread of the manuscript, I think that it could be significantly improved through a few key changes:
- Stating more concrete objectives at the start, which will have an added benefit of scaffolding the manuscript's overall structure.
- Defining parameters for how success and failure to reproduce and replicate will be determined.
- More concretely describing and interpreting the methods employed in the original study, and referring to these as "signposts".
- Providing a discrete overview of the different data sources prior to accounts of how they were used.
In other words, "taking ownership" of the study will help anchor it in a well-defined and relatively stable situated perspective that reflects your own position as reproducer and replicator. This can be supported through well-structured introduction and background sections, which set the stage for the reader by providing and framing the importance of all the information they need to know before they dive deeper in. For instance, the comparison against Fanelli and Glänzel, and the differences in data quality between WoS and OpenAlex, are kind of abrupt, and should be anticipated earlier on.
I believe that this will also support the production of clearer nomenclature, which may support cleaner coding practices. I think it's ok, and perhaps even necessary, to adapt/modify Marwick's original code to fit your own conceptual model that is more suited for your own purposes. To be clear, this is not really a substantive comment and I am not asking to refactor the code, but it illustrates what I mean by "taking ownership" of the prior work.
The findings are well supported, though see my comments below regarding their framing.
Please refer to my detailed in-line comments in the attached PDF.
Title and abstract
Does the title clearly reflect the content of the article?
- [x] Yes
- [ ] No (please explain)
- [ ] I don’t know

Does the abstract present the main findings of the study?
- [ ] Yes
- [x] No (please explain)
- [ ] I don’t know

> Data from OpenAlex also confirms the trends visible in the replicated study both for the hard/soft science categorization, for the evolution through time and for the classification of different journals, with only some minor differences.
Need to briefly summarize what the minor differences are, or the kinds of differences, between the application of WoS and OpenAlex datasets.
> This work is based on a bibliometric analysis of the Web of Science data of archaeological journals and articles, when I use the data from OpenAlex, a free and open-source database instead.
This sentence is unclear. It needs to more effectively disambiguate each paper and their respective data sources.
Introduction
Are the research questions/hypotheses/predictions clearly presented?
- [ ] Yes
- [x] No (please explain)
- [ ] I don’t know

The motivations for doing this work are stated, but the paper could benefit from articulating the objectives in a more ordered fashion.
This should include a more concrete justification for testing the replicability using OpenAlex instead of WoS; it is not enough to simply state that one is open and the other is proprietary --- you also need to explain why this distinction is important.
Additionally, you need to more precisely indicate the parameters through which replication will be attempted ("The replication will thus apply the same methodology..." --- what aspects of the methodology? Your interpretation of the original methodology is important here.) and the parameters through which degree of success will be evaluated.
Some greater context is needed regarding the totality of Marwick's paper. You indicate that you focus on replicating the first part only, but it would also be helpful to include a few lines summarizing the rest of Marwick's paper (with reference to specific "sections") and the context in which that work was situated.
It should be noted that the Marwick paper actually constitutes meta-research, and is not actually archaeological research. Despite being squarely situated in the domain of archaeological practice, I would therefore hesitate to refer to this as the first reproduction and replication of an archaeological study.
Does the introduction build on relevant research in the field?
- [ ] Yes
- [ ] No (please explain)
- [ ] I don’t know

The introduction could benefit from a very brief reference to the benefits and limitations of commercial and open bibliometric databases (as per my other comment above).
Materials and methods
Are the methods and analyses sufficiently detailed to allow replication by other researchers?
- [ ] Yes
- [ ] No (please explain)
- [ ] I don’t know

I failed to reproduce the code on my own laptop. I was unable to proceed past line 550 of `repro_marwick_OpenAlex.qmd` because the file `data/wos-data-df.rds` does not exist.
Overall, the code seems well documented, especially regarding the data preparation for the OpenAlex data. However, the code could benefit from more effective and consistent disambiguation between code created by Marwick and code adapted or modified in the reproduction. Specifically, I think that some clarification is warranted regarding whether, when, and/or how your own coding deviates from or iterates upon Marwick's original code. This comment reflects my underlying belief that a reproduction study is effectively an interpretation and that the parameters through which interpretations are made should be articulated as clearly as possible.
If applicable (for empirical studies), are sample sizes clearly justified?
- [ ] Yes
- [ ] No (please explain)
- [ ] I don’t know

There was a lot of thorough discussion regarding the comparability of data. However, the extensive account of discrepancies between data sources could perhaps be clarified through a discrete comparison of WoS and OpenAlex prior to and separate from how the author and Marwick engage with them --- as a means of putting the horse before the cart, so to speak. It's also a bit difficult to read Table 1, which addresses these issues; see my comments in the attached document for details.
Are the methods and statistical analyses appropriate and well described?
- [ ] Yes
- [ ] No (please explain)
- [ ] I don’t know

I am not statistically-gifted and I'm therefore not the best person to evaluate this aspect of the paper. But the overall methodology is sound and comparisons with Marwick's original paper seem robust.

Results
In the case of negative results, is there a statistical power analysis (or an adequate Bayesian analysis or equivalence testing)?
- [ ] Yes
- [ ] No (please explain)
- [x] I don’t know

I am not equipped to address this question due to my relative lack of statistical expertise.
Are the results described and interpreted correctly?
- [ ] Yes
- [ ] No (please explain)
- [ ] I don’t know

Section 4.1 is the first time you bring up Fanelli and Glänzel (2013), aside from very brief mentions as the basis of Marwick's paper. Some more background on their work would probably make it easier to compare their findings with both your own analysis and with Marwick's analysis --- I found myself re-reading those paragraphs many times, while also flipping back to the original sources, without fully grasping the outcome.
Both subsections (4.1 and 4.2) need some sort of summary, which could be as few as 1-2 lines for each.
Without these, it sometimes feels like I'm left hanging for the implications of the analysis. Leaving some breadcrumbs along the way to the deeper discussion will make the journey more satisfying. This also applies to the paragraph starting at line 311, where you present several overlaid observations without providing sufficient information regarding how they contribute to some greater effect.

Line 318: The reasons for your surprise with some of the findings should be elaborated upon.
Discussion
The listing of paragraphs is unclear. What are "first", "second" etc. meant to refer to? Is this referring to the intent to reproduce, and then to replicate using an alternative dataset? If so, this should be made more explicit.

The paragraph starting at line 356 is way too long and includes non-specific references and unclear statements. See my comments in the attached document for some more detailed feedback.
Have the authors appropriately emphasized the strengths and limitations of their study/theory/methods/argument?
- [ ] Yes
- [ ] No (please explain)
- [ ] I don’t knowIt's important that you define the parameters for what constitutes successful replication and reproduction prior to declaring success.
I would appreciate a more discrete account of the limitations of the study. This should be easier to implement if you provide a more thorough account of your positionality, as I suggested in my feedback for the introduction.
Are the conclusions adequately supported by the results (without overstating the implications of the findings)?
- [ ] Yes
- [ ] No (please explain)
- [ ] I don’t know

Lines 361-364: While what you highlight in that sentence is true, wasn't there also closer alignment with social sciences in other variables? If this is a meaningful finding, it should also be accounted for here.
-
This paper attempts to evaluate the replicability and reproducibility of a previous study by Marwick (2025), which used bibliographic data to assess the hardness/softness of archaeology as a science. It does this by using a different bibliographic database to collate its source data: OpenAlex, in contrast to the original study, which used Web of Science (WoS). It also takes up the conversation on the question of reproducibility in archaeology which the original article discussed, specifically computational reproducibility.
There are a number of points I would like to raise which I think the author should reflect on as I think they will help improve the presentation of the argument.
1. The framing of the basic aim and structure of the paper could be made clearer (as could the title and abstract, which I found very confusing). The paper starts with the distinction between replication and reproduction – this is important, but I felt it sometimes got lost as the paper went through the analysis, since the two aspects were largely presented side by side. I almost think the paper would be clearer if these two parts were separated into different sections of text. For example, start with a section focusing solely on reproduction and the minor issues that emerged, and then move on to a section on replication. Because the results do have different implications – one concerns the reliability of a method, the other, the reliability of results generated by a method. As the study showed, the method was reliable, but using a slightly different method (i.e. a different database), slightly different results emerge.
2. The paper is almost written as if personally addressed to Marwick, in the sense that it presupposes a lot of the reader. The background to Marwick’s paper is too brief and I had to go to the original paper to fully appreciate the context to this paper. I think the author needs to prep the reader more. There is also a lot of technical information about coding etc. which went over my head – the question is how many of the readers will also understand this? Could some be relegated to an appendix?
3. The choice of using OpenAlex in the replication part of the study is fine, but I think the discussion could have been broadened (see lines 58-77). One almost gets the impression there are only 2 choices of bibliographic databases – WoS and OA. What about Scopus or others? Everyone knows these different databases have their own biases towards certain disciplines and as this was clearly a factor in selecting OA, I think a few sentences to contextualize this more broadly would have been helpful.
4. There is also the broader issue of the core topics that drive both this paper and Marwick's: the status of archaeology as a science and replicability/reproducibility in the sciences/archaeology. This paper largely echoes the position taken by Marwick here, but many readers will find problems with both of these points.
a. the replicability/reproducibility in the sciences/archaeology – there is actually some literature on this issue within the philosophy of science that both papers seem to ignore (e.g., the work of Harry Collins). Some of this literature has long pointed out that replication is one of those myths about science that constantly gets repeated, so there is nothing new here.
b. The whole presumption of the distinction between hard and soft science is also somewhat a ‘pop culture’ view of the matter and even if it reflects the views of many practitioners, studies like this only perpetuate this view instead of helping to deconstruct it. Again, deeper engagement with work in the philosophy of science would not be amiss. And using bibliographic data on 5 variables (as Marwick does and as this paper reproduces) to mark this distinction could be seen as rather simplistic and reductive.
Despite my misgivings on these 2 issues, I am not suggesting the author re-write the whole paper; it would be enough to acknowledge them. For example, you can argue this paper both reproduces and largely replicates the results of Marwick's study. But that is not the same as confirming that archaeology lies roughly midway between the hard and soft sciences; all it tells us is that in terms of these 5 bibliographic variables, archaeology publications fall between publications in physics and the humanities. What THAT means is open to all kinds of interpretations, and the whole issue of hard vs soft sciences is, to my mind, far more complex and problematic. Analyses like this only reinforce a stereotype rather than seriously open up an issue. Even if the author disagrees with my perspective on this, I think some acknowledgement of the complexity of the issue is still warranted, both in the introduction and conclusions to this paper.
Having said all this, I want to end by thanking the author for sharing this work which engages with the work of a colleague in such a close way. It showcases how much of science is a conversation with our peers and all too often, this aspect of our work is hidden from sight. And, as always, I learnt something new!
-
Please see the accompanying PDF.
