Bi-Predictability: A Real-Time Signal for Monitoring LLM Interaction Integrity
Abstract
Large language models (LLMs) are increasingly deployed in multi-turn workflows where reliability depends on maintaining interaction integrity over time. Current evaluation methods are poorly matched to this setting: judge-based systems are post hoc and costly, while token-level measures such as perplexity capture output uncertainty but not whether the interaction remains structurally coupled. Here we show that interaction integrity can be monitored continuously using bi-predictability (𝑃), an information-theoretic measure computed from token-frequency statistics across the context–response–next-prompt loop. We operationalize 𝑃 through the Information Digital Twin (IDT), a lightweight architecture that estimates coupling from the observable token stream alone, without embeddings, auxiliary evaluators, or access to model internals. Across 4,500 turns between one student model and three frontier teacher models, the IDT detected all tested perturbations, including contradictions, topic shifts, and non-sequiturs, with 100% sensitivity, matching costlier methods at a fraction of the overhead. Structural coupling and semantic quality proved empirically separable: 𝑃 aligned with structural consistency in 85% of conditions but with semantic scores in only 44%, revealing a regime of silent uncoupling in which responses remain strong while interaction integrity degrades. These results establish 𝑃 as a practical, low-cost, real-time drift-monitoring signal and suggest that structural and semantic evaluation should serve as complementary layers in reliable LLM deployment.
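The abstract does not give the exact formula for 𝑃, only that it is an information-theoretic quantity computed from token-frequency statistics of adjacent turns. As a purely illustrative sketch of how such a coupling signal could be computed from the observable token stream alone (the function name, the choice of unigram statistics, and the use of a Jensen–Shannon-style similarity are all assumptions, not the paper's method):

```python
from collections import Counter
import math

def unigram_dist(tokens):
    """Normalised token-frequency distribution over a turn."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def bi_predictability(prev_turn, next_turn):
    """Illustrative coupling proxy (NOT the paper's exact P):
    1 minus the base-2 Jensen-Shannon divergence between the
    unigram distributions of two adjacent turns, so 1.0 means
    identical token statistics and 0.0 means disjoint vocabularies."""
    p = unigram_dist(prev_turn)
    q = unigram_dist(next_turn)
    js = 0.0
    for t in set(p) | set(q):
        pt, qt = p.get(t, 0.0), q.get(t, 0.0)
        mt = 0.5 * (pt + qt)  # mixture distribution
        if pt > 0:
            js += 0.5 * pt * math.log2(pt / mt)
        if qt > 0:
            js += 0.5 * qt * math.log2(qt / mt)
    return 1.0 - js

# Coupled turns share vocabulary; a non-sequitur does not.
coupled = bi_predictability("the model drifted".split(),
                            "why did the model drift".split())
broken = bi_predictability("the model drifted".split(),
                           "bananas are yellow fruit".split())
assert coupled > broken
```

A monitor built this way needs only the token stream of each turn, matching the abstract's claim that no embeddings, auxiliary evaluators, or model internals are required; a drop in the signal across the response–next-prompt boundary would flag the "silent uncoupling" regime even when individual responses still score well semantically.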