Machine Theory of Mind and How Large Language Models Mimic Human Mind Perception but Mask Representational Divergence
Abstract
As LLMs increasingly mediate decisions, recommendations, and persuasive communication, a central question is whether their mind-attribution structure in output space is human-like or only behaviorally similar. This raises a behavior–geometry paradox: systems can produce human-like responses while relying on a latent structure that differs from human social cognition. We address this question with a large-scale repeated-elicitation design spanning 16 agents, 65 mental capacities, and 178 replications (185,120 judgments), followed by dimensional analyses and benchmark comparisons against established human mind-perception frameworks. We report three findings. First, model judgments recover a recognizable human-like backbone of mind ascription. Second, this backbone is systematically compressed, with weaker separation among capacities and reduced structural contrast. Third, key dimensions are reconfigured under broader probes, including shifts in how affective, moral-mental, and reality-interaction capacities are partitioned. Together, these results show that behavioral alignment can coexist with representational divergence. The findings support extending safety and alignment evaluation beyond response-level performance to include audits of output-inferred representational geometry, with implications for governance, deployment in persuasive contexts, and theory-building at the intersection of AI and social cognition.
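The reported judgment count follows from the factorial structure of the design; a minimal arithmetic check, assuming every agent–capacity pair is elicited once per replication:

\[
16 \ \text{agents} \times 65 \ \text{capacities} \times 178 \ \text{replications} = 185{,}120 \ \text{judgments}
\]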