Can Large Language Models Substitute Participant-Based Survey Studies?
Abstract
Survey studies are a cornerstone of psychological research, yet declining response rates make it increasingly difficult to recruit representative samples. Advances in large language models offer a potential solution: simulating human responses. This study assesses the effectiveness of four prompt techniques for simulating human responses with ChatGPT-4o: chain-of-thought prompting (step-by-step reasoning), role-based prompting (demographic framing), bias-mitigation prompting (a skeptical stance intended to reduce bias), and N-shot learning (human example responses used to guide the model). Survey questions were designed to capture different constructs across different contexts and included both text and images. Results show that prompt design significantly affects how closely simulated responses align with human responses. Chain-of-thought and role-based prompting performed poorly. Bias-mitigation prompts improved alignment with human responses. N-shot learning consistently outperformed the other prompt designs, generating responses that closely mirrored those of human survey participants. These findings position ChatGPT-4o with N-shot learning as a valid, low-cost, and scalable complement to traditional participant-based survey studies.
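To make the N-shot approach concrete, the sketch below shows one way such prompting could be set up with the OpenAI Python client: real human responses are supplied as prior conversation turns, and the model is then asked to answer a new survey item in the same format. This is only an illustrative assumption of how the technique might be implemented; the survey statements, the 1-5 Likert scale, the example answers, and the "gpt-4o" model identifier are hypothetical placeholders rather than the study's actual materials or prompts.

```python
# Minimal sketch of N-shot prompting for survey simulation, assuming the
# openai Python client and a gpt-4o model. The survey items, scale, and
# example responses are hypothetical and stand in for the study's materials.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Hypothetical human example responses used as the "shots" that guide the model.
human_examples = [
    ("I enjoy meeting new people at social events.", "4"),
    ("I often feel drained after long conversations.", "2"),
]

# Hypothetical target survey item to be answered on a 1-5 Likert scale.
target_item = "I prefer working in a team rather than alone."

messages = [
    {
        "role": "system",
        "content": (
            "You are simulating a survey participant. Answer each statement "
            "with a single number from 1 (strongly disagree) to 5 (strongly agree)."
        ),
    },
]

# Insert the human examples as prior user/assistant turns (the N shots).
for statement, answer in human_examples:
    messages.append({"role": "user", "content": statement})
    messages.append({"role": "assistant", "content": answer})

# Ask for the simulated response to the new item.
messages.append({"role": "user", "content": target_item})

response = client.chat.completions.create(model="gpt-4o", messages=messages)
print(response.choices[0].message.content)
```

The other prompt designs compared in the study would differ mainly in the system message and conversation structure, for example adding demographic framing (role-based), step-by-step reasoning instructions (chain-of-thought), or an explicit skeptical stance (bias mitigation) instead of the example turns.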