Formalized scientific methodology enables rigorous AI-conducted research across domains


Abstract

We formalize scientific methodology (the end-to-end process from question formulation to evidence-grounded writing) as a phase-gated research protocol with explicit return paths and persistent constraints, and we instantiate it for general-purpose language models as executable protocol specifications. The formalization decomposes methodology into three complementary layers: a procedural workflow, an integrity discipline, and project governance. Encoded as protocol specifications and activated across the research lifecycle, these constraints externalize planning and verification artifacts and make integrity-relevant interventions auditable. We validate the approach in six end-to-end projects, including a matched controlled study in which the same agent produced two complete papers, one with the protocol and one without. Across domains, the protocol-constrained agent produced evidence-backed, auditable research outputs, including closed-form derivations, quantitative ablations that resolve modeling design choices, and algorithmic refactors that preserve the objective while changing the computational primitive. In population-genomic applications, it also recovered well-studied biological signals, which served as validity checks: known admixture targets in the 1000 Genomes Project and Neanderthal-introgressed immune loci on chromosome 21 consistent with prior catalogs. In the controlled study, the protocol-free baseline still produced a complete manuscript, but without the constraints and artifacts, integrity-relevant risks were easier to introduce and harder to detect.
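As a rough illustration of what "phase-gated with explicit return paths and persistent constraints" could mean in code, the following is a minimal sketch and not the authors' specification. Every name in it (`Phase`, `Protocol`, `work`, `gate`, `return_to`, the three toy phases, and the `no_unsupported_claims` constraint) is a hypothetical stand-in chosen for the example. Each phase does some work, a persistent constraint is checked after every phase, a gate decides whether to advance, and a failed gate either follows its declared return path or halts. Every decision is appended to an audit log.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Phase:
    name: str
    work: Callable[[dict], None]      # mutates the shared project state
    gate: Callable[[dict], bool]      # must hold before the next phase starts
    return_to: Optional[str] = None   # explicit return path if the gate fails


@dataclass
class Protocol:
    phases: List[Phase]
    constraints: Dict[str, Callable[[dict], bool]]  # checked after every phase
    audit_log: List[Tuple[str, str]] = field(default_factory=list)

    def run(self, state: dict, max_steps: int = 20) -> dict:
        index = {p.name: i for i, p in enumerate(self.phases)}
        i, steps = 0, 0
        while i < len(self.phases):
            if steps >= max_steps:
                raise RuntimeError("protocol did not converge")
            steps += 1
            phase = self.phases[i]
            phase.work(state)
            # Persistent constraints apply in every phase, not just one.
            for cname, check in self.constraints.items():
                if not check(state):
                    self.audit_log.append((phase.name, f"violation:{cname}"))
                    raise RuntimeError(f"constraint {cname} violated")
            if phase.gate(state):
                self.audit_log.append((phase.name, "pass"))
                i += 1
            elif phase.return_to is not None:
                self.audit_log.append((phase.name, f"return:{phase.return_to}"))
                i = index[phase.return_to]
            else:
                self.audit_log.append((phase.name, "blocked"))
                raise RuntimeError(f"gate failed in {phase.name}")
        return state


# Toy work functions: each revision of the question yields one more evidence item.
def formulate(state: dict) -> None:
    state["revisions"] += 1

def analyze(state: dict) -> None:
    state["evidence"] = state["revisions"]

def write(state: dict) -> None:
    state["claims"] = state["evidence"]


protocol = Protocol(
    phases=[
        Phase("question", formulate, lambda s: s["revisions"] > 0),
        Phase("analysis", analyze, lambda s: s["evidence"] >= 2, return_to="question"),
        Phase("writing", write, lambda s: s["claims"] > 0),
    ],
    constraints={"no_unsupported_claims": lambda s: s["claims"] <= s["evidence"]},
)

final = protocol.run({"revisions": 0, "evidence": 0, "claims": 0})
for entry in protocol.audit_log:
    print(entry)
```

In this toy run the analysis gate fails once, so the log records a return to the question phase before the run completes. The point of the design is that the return and the constraint checks are recorded as artifacts rather than left implicit in the agent's behavior.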
