Synthetic personas distort the structure of human belief systems
Abstract
Large language models (LLMs) are increasingly used as synthetic survey respondents, yet it is unclear whether their belief-system structure matches that of real publics. We compare 28 LLMs to the 2024 General Social Survey (GSS) using 52 attitude items and demographic persona traits. We estimate polychoric correlation matrices and propagate uncertainty in the GSS via bootstrap resampling with multiple imputation. Constraint is measured by the variance share explained by the first principal component and by effective dependence, a determinant-based measure of global linear dependence. Across models, LLM personas exhibit substantially higher constraint than humans; conditioning on persona traits reduces constraint far more for LLMs, indicating greater demographic mediation. Projection onto a shared GSS basis further shows overemphasis of the leading dimension and missing secondary structure. These results caution against treating LLM personas as a reliable foundation for synthetic survey data generation.
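The two constraint measures named in the abstract can be sketched from a correlation matrix. The snippet below is a minimal illustration, not the authors' code: `pc1_share` is the variance share of the first principal component, and `effective_dependence` uses one common determinant-based formulation (1 minus the p-th root of the determinant); the exact normalization used in the paper is an assumption here.

```python
import numpy as np

def pc1_share(R: np.ndarray) -> float:
    # Share of total variance captured by the leading eigenvector
    # of the correlation matrix R (sum of eigenvalues equals p).
    eigvals = np.linalg.eigvalsh(R)
    return float(eigvals[-1] / eigvals.sum())

def effective_dependence(R: np.ndarray) -> float:
    # Determinant-based global linear dependence: 0 when variables
    # are uncorrelated (det = 1), approaching 1 as R nears singularity.
    # Normalization by 1/p is an illustrative choice.
    p = R.shape[0]
    return float(1.0 - np.linalg.det(R) ** (1.0 / p))

# Toy 3-item correlation matrix (hypothetical values).
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
print(pc1_share(R), effective_dependence(R))
```

Higher values on either measure indicate a more tightly constrained belief system, which is the sense in which the abstract reports LLM personas exceeding human respondents.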