From Error to Estimation - Towards a Geometric Understanding of Hallucination

Abstract

Hallucination in large language models has traditionally been viewed as a correctable error arising from imperfect training. We propose a fundamental reframing: hallucination may not be a defect but rather a lawful geometric phenomenon emerging from the curvature of the semantic manifold. By integrating information theory, differential geometry, and empirical analysis, we demonstrate that hallucination rate (h) systematically increases with representational curvature (κ), independent of information density (ρ). Using GPT-2 across five reasoning domains (n=1,500), we find a consistent correlation r(κ,h)=0.83 (p≈0.08), providing preliminary evidence that hallucination arises predictably where meaning-space bends most sharply. This transforms hallucination from mystery to measurement: not an error to suppress, but a quantifiable cost of compressing knowledge into finite linguistic form. We introduce the curvature–hallucination relationship h∝f(κ), unifying the manifold hypothesis in representation learning, information-geometry trade-offs, and distortion under abstraction. This framework enables a shift from error correction to geometric control: monitoring curvature to predict and regulate factual instability. What began as a reliability problem may become the first quantitative law linking representation, compression, and truth, the geometry of understanding itself. All Tier 1 experimental code, datasets, and metrics are openly released at the GitHub repository https://github.com/enkiluv/hallucination_curvature_experiment, inviting the research community to replicate, extend, and test whether this relationship generalizes across architectures, languages, and modalities in forthcoming Tier 2 and Tier 3 studies.
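
As a reading aid for the reported statistic, the sketch below converts a Pearson correlation into a two-sided p-value with the standard t-test for a correlation coefficient. It assumes, and this is only an assumption since the abstract does not say so, that the correlation is computed over five domain-level (κ, h) pairs, one per reasoning domain, giving 3 degrees of freedom. The function names are illustrative and are not taken from the released repository.

```python
import math

def t_from_r(r: float, n_points: int) -> float:
    """t statistic for a Pearson correlation r over n_points pairs."""
    df = n_points - 2
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)

def two_sided_p_df3(t: float) -> float:
    """Two-sided p-value for Student's t with exactly 3 degrees of freedom.

    Uses the closed-form CDF for df = 3:
    F(t) = 1/2 + (1/pi) * (x / (1 + x^2) + atan(x)), with x = t / sqrt(3).
    """
    x = abs(t) / math.sqrt(3.0)
    cdf = 0.5 + (x / (1.0 + x * x) + math.atan(x)) / math.pi
    return 2.0 * (1.0 - cdf)

# Assumption: 5 data points (one per reasoning domain), not the 1,500 samples.
t_stat = t_from_r(0.83, n_points=5)
p_value = two_sided_p_df3(t_stat)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")  # p comes out near 0.08
```

Under that assumption the result is consistent with the reported p≈0.08. With only five points, a correlation of 0.83 does not reach the conventional 0.05 threshold, which fits the abstract's description of the evidence as preliminary.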
