LLM-Assisted Replication as Scientific Infrastructure
Abstract
Large language models (LLMs) are rapidly accelerating scientific production, from literature synthesis to automated analysis. Yet this expansion risks creating a verification gap, in which the volume of scientific claims outpaces the community's capacity to check their reproducibility. We argue that the same LLM capabilities driving scientific output can be redirected toward scalable verification. As a concrete example, we demonstrate how an autonomous LLM-based agent can reproduce the core statistical results of a classic sociology paper while identifying underspecified methodological details. Automated replication does not adjudicate scientific truth; rather, it localizes discrepancies and documentation gaps, lowering the cost of computational reproducibility checks. We therefore propose embedding LLM-assisted replication across the research lifecycle: pre-submission quality checks, journal-integrated verification, post-publication audits, and forensic reconstruction of legacy studies. To prevent misuse and preserve trust, we call for transparent standards and community governance. If institutionalized responsibly, AI can serve not only to generate science, but to scale its self-correction.
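To illustrate the "localize discrepancies, don't adjudicate truth" idea, the sketch below is a minimal discrepancy-localization harness of our own devising, not the authors' system: it compares statistics as reported in a publication against values recomputed by a replication agent and flags per-statistic mismatches beyond a tolerance. All names and numbers (`reported`, `reproduced`, `TOLERANCE`) are hypothetical placeholders for the agent's actual inputs and outputs.

```python
# Minimal sketch (illustrative, not the paper's implementation): after an
# LLM agent has re-executed a paper's analysis, compare the reported
# statistics with the reproduced values and localize discrepancies rather
# than issue a single pass/fail verdict. All values below are hypothetical.

TOLERANCE = 0.01  # relative tolerance for "matches as reported"

# Statistics as printed in the publication (hypothetical values).
reported = {"coef_education": 0.42, "coef_income": 0.18, "n_obs": 1204}

# Statistics recomputed by the replication agent (hypothetical values).
reproduced = {"coef_education": 0.42, "coef_income": 0.21, "n_obs": 1204}

def localize_discrepancies(reported, reproduced, tol=TOLERANCE):
    """Return a per-statistic report, flagging values outside tolerance."""
    report = []
    for name, ref in reported.items():
        rep = reproduced.get(name)
        if rep is None:
            report.append((name, "missing in reproduction"))
        elif abs(rep - ref) > tol * max(abs(ref), 1e-12):
            report.append((name, f"mismatch: reported {ref}, reproduced {rep}"))
        else:
            report.append((name, "matches within tolerance"))
    return report

for name, status in localize_discrepancies(reported, reproduced):
    print(f"{name}: {status}")
```

In a full pipeline, `reproduced` would come from the agent's re-execution of the original analysis code or its reconstruction from the methods section; the key design choice is that the output is an itemized diff a human can inspect, which is what lowers the cost of the reproducibility checks proposed above.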