AI Agent Prevalence and Data Quality Across Multiple Online Sample Providers
Abstract
Online recruitment platforms have become the dominant infrastructure for behavioral research, yet data quality concerns have acquired new urgency with the emergence of large language models (LLMs). Recent work showing that LLM-based agents can complete surveys while evading standard quality checks has prompted alarm about synthetic respondents infiltrating samples at scale. However, demonstrating agent capability is not equivalent to demonstrating ecosystem-level deployment, and variation in quality among human respondents across platform types may be a more consequential threat. We address both questions in a single pre-registered study: (1) what is the actual prevalence of AI agents across platforms, and (2) how does human data quality vary across structural market segments? We recruited 5,200 respondents across 13 conditions from 10 platforms spanning direct first-party panels, hybrid networks, and marketplace aggregators. Agent detection employed an automated environment check that achieved perfect discrimination in pilot testing, plus a secondary battery of six behavioral indicators. Human quality was assessed across seven behavioral dimensions alongside metadata including device type, ecosystem activity, and cost efficiency. Agent detections were concentrated almost exclusively on Amazon MTurk (11–16%), with all other platforms at or below 1%; detected responses showed profiles more consistent with traditional bots than with LLM-based agents. We found evidence of humans using LLMs to augment their answers, particularly on open-ended or difficult items, at rates consistent with recent work that assumes no deployed mitigation. Human data quality varied substantially by platform type: direct panels outperformed hybrid platforms, which outperformed marketplace platforms, across nearly all measures, an effect several times larger than that of agents or LLM augmentation.
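The abstract does not disclose the specifics of the automated environment check. As a minimal illustrative sketch only, a common approach flags client-side automation signals such as the `navigator.webdriver` property, an empty language list, or a headless user-agent string; the signal names and schema below are assumptions, not the study's actual instrument:

```python
def looks_automated(env: dict) -> bool:
    """Flag a session exhibiting common browser-automation signals.

    `env` is a dict of client-side signals (hypothetical schema);
    the paper's actual environment check is not specified here.
    """
    if env.get("webdriver"):          # navigator.webdriver is set by automation tools
        return True
    if not env.get("languages"):      # headless browsers often report no languages
        return True
    ua = env.get("user_agent", "").lower()
    return "headless" in ua           # e.g. "HeadlessChrome" in the UA string
```

A real check would combine many such signals and be validated against known-human traffic, as the pilot testing described above implies.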
Cost-efficiency analyses revealed that direct panels, despite higher nominal costs, were the most economical once quality thresholds were applied. The field's most pressing data quality challenge remains systematic variation in human respondent quality by platform type, not AI agent infiltration.
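The cost-efficiency logic can be sketched simply: dividing nominal cost per response by the share of responses that survive quality screening yields the effective cost per usable response. The figures below are hypothetical for illustration, not the study's results:

```python
def effective_cost_per_usable_response(nominal_cost: float, pass_rate: float) -> float:
    """Cost per response that passes quality thresholds."""
    if not 0 < pass_rate <= 1:
        raise ValueError("pass_rate must be in (0, 1]")
    return nominal_cost / pass_rate

# Hypothetical illustration: a pricier direct panel with a high pass rate
# can undercut a cheap marketplace with a low pass rate.
direct = effective_cost_per_usable_response(3.00, 0.95)       # ≈ 3.16 per usable response
marketplace = effective_cost_per_usable_response(1.50, 0.40)  # 3.75 per usable response
```

Under these assumed numbers the direct panel is cheaper per usable response despite double the nominal price, mirroring the pattern the analyses report.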