Prompt Volatility: An Empirical Study of Identity Drift in Large Language Model Agents

Read the full article See related articles

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Large language model (LLM)–based agents are increasingly deployed with explicit role definitions and task constraints. However, the stability of such identities under adversarial or conflicting informational demands remains underexplored. In this work, we introduce prompt volatility as a measurable property of agent design, defined as the sensitivity of role adherence to the initial prompt specification. We present a minimal, reproducible benchmark that evaluates identity stability by subjecting agents to controlled perturbations, including role attacks, noise injections, and contradictory instructions. Using identical perturbation protocols, we compare two agents differing only in the strength of their initial prompt constraints. Empirical results show that weakly specified (junior) prompts exhibit significantly higher behavioral variance and reduced identity stability, while strongly constrained (senior) prompts remain invariant across runs. These findings demonstrate that prompt configuration functions as a critical initial condition for agent behavior, directly impacting robustness against role drift. The proposed benchmark provides a lightweight baseline for evaluating prompt volatility and offers a foundation for future studies on agent reliability, safety, and long-horizon behavior in LLM-based systems.

Article activity feed