AI in the Experimental Loop: Implications for Replicability in Social Sciences
Abstract
Integrating generative Artificial Intelligence (AI), in particular Large Language Models (LLMs), into the experimental social sciences presents both powerful opportunities and significant methodological challenges. A major issue is the stochastic nature of these models' output, which complicates replicability, a foundational principle of scientific research. This paper examines the transition from the traditional Experiment-Subjects Dyad (ESD) experimental design to what we refer to as the Experiment-AI-Subjects Triad (EAIST) design, in which AI is employed to generate experimental trials. In the EAIST design, AI can provide adaptive stimuli and generative systems, which may undermine experimental control and threaten the reliability of Human-Robot Interaction studies. We review specific examples, such as emotionally expressive chatbots and GAN-generated facial expressions, and identify two sources of variability they introduce. We then propose a framework to enhance replicability in AI-driven research, drawing on principles from the Open Science movement. Key strategies include modular testing, parameter fixation, structured prompt engineering, and robust experimental design. We provide a checklist to ensure the robustness and replicability of EAIST experiments. Our recommendations aim to preserve the ecological advantages of AI while reinforcing methodological transparency and scientific rigour.
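As a minimal illustration of one of the strategies named above, parameter fixation can be implemented by freezing every generation setting in a single version-controlled configuration and reporting a stable fingerprint of it alongside the results. The model identifier, parameter values, and prompt template below are hypothetical placeholders, not any provider's actual defaults; whether a fixed seed or zero temperature yields bitwise-identical output depends on the model provider.

```python
import hashlib
import json

# Hypothetical frozen generation configuration for an EAIST-style experiment.
# All names and values are illustrative assumptions, not a specific API's settings.
GENERATION_CONFIG = {
    "model": "example-llm-v1",   # assumed model identifier
    "temperature": 0.0,          # reduce sampling randomness where the provider supports it
    "top_p": 1.0,
    "seed": 12345,               # fixed seed, for providers that accept one
    "prompt_template": "You are a neutral interviewer. Ask the participant: {question}",
}

def config_fingerprint(config: dict) -> str:
    """Return a stable SHA-256 hash of the configuration, canonicalised via
    sorted-key JSON, so the exact settings used can be reported and compared
    across replication attempts."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Report the fingerprint with the experiment; any later run with a differing
# fingerprint was not executed under the same frozen parameters.
print(config_fingerprint(GENERATION_CONFIG)[:12])
```

The fingerprint does not make the model itself deterministic; it only makes deviations from the registered settings detectable, which is the auditable part of replicability that remains under the experimenter's control.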