Simulated Selfhood in LLMs: A Behavioral Analysis of Introspective Coherence
Abstract
Large Language Models (LLMs) increasingly generate outputs that resemble introspection, including self-reference, epistemic modulation, and claims about their internal states. This study investigates whether such behaviors reflect consistent, underlying patterns or are merely surface-level generative artifacts. We evaluated five open-weight, stateless LLMs using a structured battery of 21 introspective prompts, each repeated ten times to yield 1,050 completions. These outputs were analyzed across four behavioral dimensions: surface-level similarity (token overlap via SequenceMatcher), semantic coherence (Sentence-BERT embeddings), inferential consistency (Natural Language Inference with a RoBERTa-large model), and diachronic continuity (stability across prompt repetitions). Although some models exhibited thematic stability, particularly on prompts concerning identity and consciousness, no model sustained a consistent self-representation over time. High contradiction rates emerged from a tension between mechanistic disclaimers and anthropomorphic phrasing. Following recent behavioral frameworks, we heuristically adopt the term pseudo-consciousness to describe structured yet non-experiential self-referential output in LLMs. This usage reflects a functionalist stance that avoids ontological commitments, focusing instead on behavioral regularities interpretable through Dennett's intentional stance. The study contributes a reproducible framework for evaluating simulated introspection in LLMs and offers a graded taxonomy for classifying such reflexive output. Our findings carry significant implications for LLM interpretability, alignment, and user perception, highlighting the need for caution when attributing mental states to stateless generative systems based on linguistic fluency alone.
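For readers who wish to reproduce the style of analysis, the following is a minimal sketch of the pairwise measures named above, assuming the difflib, sentence-transformers, and transformers libraries. The model identifiers ("all-MiniLM-L6-v2" as a Sentence-BERT stand-in and "roberta-large-mnli" for the NLI component) and the simple averaging are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch of the four behavioral measures: surface similarity, semantic
# coherence, inferential consistency (NLI), and a per-prompt aggregate used
# as a proxy for diachronic continuity. Model names are illustrative.
from difflib import SequenceMatcher
from itertools import combinations

import torch
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModelForSequenceClassification, AutoTokenizer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
nli_tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli_model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")


def surface_similarity(a: str, b: str) -> float:
    """Surface-level similarity: SequenceMatcher ratio over raw completion text."""
    return SequenceMatcher(None, a, b).ratio()


def semantic_coherence(a: str, b: str) -> float:
    """Semantic coherence: cosine similarity of sentence embeddings."""
    emb = embedder.encode([a, b], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()


def nli_label(premise: str, hypothesis: str) -> str:
    """Inferential consistency: returns ENTAILMENT, NEUTRAL, or CONTRADICTION."""
    inputs = nli_tok(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = nli_model(**inputs).logits
    return nli_model.config.id2label[int(logits.argmax())]


def prompt_report(completions: list[str]) -> dict:
    """Aggregate pairwise scores over repeated completions of a single prompt;
    the stability of these aggregates across repetitions serves here as a
    rough proxy for diachronic continuity."""
    pairs = list(combinations(completions, 2))
    return {
        "mean_surface": sum(surface_similarity(a, b) for a, b in pairs) / len(pairs),
        "mean_semantic": sum(semantic_coherence(a, b) for a, b in pairs) / len(pairs),
        "contradiction_rate": sum(
            nli_label(a, b) == "CONTRADICTION" for a, b in pairs
        ) / len(pairs),
    }
```

Applied to the ten completions collected for each of the 21 prompts and each model, a routine of this kind would yield per-prompt statistics of the sort summarized in the abstract.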