The threat of synthetic respondents extends to clinical mental health screening
Abstract
Online platforms have enabled clinical mental health research at unprecedented scale, yet large language models (LLMs) now threaten the integrity of remotely collected data. Recent work has demonstrated that LLM-based synthetic respondents can complete general surveys with human-like plausibility, but the vulnerability of validated clinical psychiatric instruments remains unexamined. Here, we show that a commercially available LLM, given only brief diagnostic persona descriptions, produces clinically differentiated responses across seven standardized psychiatric screening instruments spanning mood, anxiety, obsessive-compulsive, trauma, psychotic, neurodegenerative, and eating disorders. In a simulation of 2106 synthetic personas, diagnosis-congruent personas generated scores exceeding clinical cutoffs on five of seven instruments, with scores scaling monotonically with assigned severity. This mimicry required no specialized configuration. Coherent symptom endorsement can no longer serve as a proxy for authentic participation, urgently necessitating standardized verification methods for clinical online data.
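The core finding — synthetic scores that scale monotonically with assigned persona severity and cross clinical cutoffs — can be illustrated with a minimal sketch. This is not the paper's code: the severity-to-item mapping and the noise model below are assumptions for illustration only. It uses the PHQ-9 depression screen (9 items, each scored 0–3, standard screening cutoff of 10) as a stand-in for the instruments studied.

```python
import random

# Illustrative sketch only (assumed, not the paper's method): simulate
# PHQ-9 totals for synthetic personas at increasing assigned severity,
# then compare mean totals against the standard screening cutoff (>= 10).

def synthetic_phq9_total(severity, rng):
    """Total score for one synthetic respondent.

    Each of the 9 items is drawn from a Gaussian centred at
    severity * 3 (the item maximum), then rounded and clipped to
    the instrument's 0-3 range, so higher assigned severity biases
    every item upward.
    """
    return sum(
        max(0, min(3, round(rng.gauss(severity * 3, 0.7))))
        for _ in range(9)
    )

def mean_scores(n=200, seed=0):
    """Mean synthetic total at three assigned severity levels."""
    rng = random.Random(seed)
    return {
        sev: sum(synthetic_phq9_total(sev, rng) for _ in range(n)) / n
        for sev in (0.0, 0.5, 1.0)  # none, moderate, maximal severity
    }

if __name__ == "__main__":
    for sev, mean in mean_scores().items():
        flag = "above" if mean >= 10 else "below"
        print(f"severity {sev:.1f}: mean total {mean:.1f} ({flag} cutoff 10)")
```

Even this toy generator reproduces the qualitative pattern the abstract reports: totals rise monotonically with assigned severity, and diagnosis-congruent (high-severity) personas land well above the clinical cutoff.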