Emergent Non-Classical Probabilistic Structure in Large Language Models Under Contextual Modulations
Abstract
Classical probability theory provides the standard foundation for rational inference, yet human probability judgments are known to systematically violate Kolmogorovian axioms through phenomena such as conjunction fallacies and question-order effects. Whether large language models (LLMs) exhibit structurally analogous departures from classical probability remains an important empirical question with broad implications for AI evaluation and cognitive science. Here, we report a controlled experimental investigation of the probabilistic structure of LLM responses under systematically varied contextual prompts, treating model outputs as behavioral observables of a high-dimensional inference system. Using paradigms drawn from cognitive decision research, we evaluated multiple LLMs across tasks including ambiguity judgments, belief revision, order effects, conjunction fallacies, and base-rate reasoning. Our experiments reveal systematic violations of the law of total probability accompanied by pronounced order-dependent effects across all models. Critically, these deviations are not random: they exhibit a structured interference form characterized by a phase-dependent term and a non-trivial upper bound, and the observed scaling relation demonstrates that the resulting probability assignments cannot be embedded within any single Kolmogorov probability space. These findings are consistent with a constrained non-commutative probabilistic structure, formally analogous to quantum probability models previously proposed for human cognition, without implying that neural networks implement quantum-physical processes. Our results establish non-commutative probability as a principled descriptive framework for contextual inference in large-scale artificial systems, and highlight the need for structural evaluations of probabilistic reasoning in modern AI that go beyond accuracy-based metrics.
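For reference, a minimal sketch of the interference decomposition alluded to above, written in the form standardly used in quantum-cognition models (the abstract does not spell out the paper's exact expression): for a focal event A and a binary conditioning event B,
\[
P(A) \;=\; P(B)\,P(A\mid B) \;+\; P(\neg B)\,P(A\mid \neg B) \;+\; 2\cos\theta\,\sqrt{P(B)\,P(A\mid B)\,P(\neg B)\,P(A\mid \neg B)},
\]
so that the deviation from the classical law of total probability,
\[
\delta_A \;=\; P(A) \;-\; \big[\,P(B)\,P(A\mid B) \;+\; P(\neg B)\,P(A\mid \neg B)\,\big],
\]
carries a phase-dependent term \( 2\cos\theta\,\sqrt{\cdot} \) and is bounded by \( |\delta_A| \le 2\sqrt{P(B)\,P(A\mid B)\,P(\neg B)\,P(A\mid \neg B)} \). A single-Kolmogorov-space account requires \( \delta_A = 0 \) across all contexts, which is the null hypothesis against which the reported violations are assessed.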