Prompt Volatility: An Empirical Study of Identity Drift in Large Language Model Agents
Discuss this preprint
Start a discussion What are Sciety discussions?Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
Large language model (LLM)–based agents are increasingly deployed with explicit role definitions and task constraints. However, the stability of such identities under adversarial or conflicting informational demands remains underexplored. In this work, we introduce prompt volatility as a measurable property of agent design, defined as the sensitivity of role adherence to the initial prompt specification. We present a minimal, reproducible benchmark that evaluates identity stability by subjecting agents to controlled perturbations, including role attacks, noise injections, and contradictory instructions. Using identical perturbation protocols, we compare two agents differing only in the strength of their initial prompt constraints. Empirical results show that weakly specified (junior) prompts exhibit significantly higher behavioral variance and reduced identity stability, while strongly constrained (senior) prompts remain invariant across runs. These findings demonstrate that prompt configuration functions as a critical initial condition for agent behavior, directly impacting robustness against role drift. The proposed benchmark provides a lightweight baseline for evaluating prompt volatility and offers a foundation for future studies on agent reliability, safety, and long-horizon behavior in LLM-based systems.