Regret Is Weighted Forgetting

Abstract

How much of an agent's regret comes from a bad representation, and how much from a bad policy? This paper gives an exact answer. For a fixed representation M and a finite evaluation distribution over history-test pairs, the minimum average normalized regret over all M-based policies equals the minimum margin-weighted deletion cost needed to make the optimal bet single-valued on each representation-test cell (M(h),T). A policy-wise decomposition then splits any actual policy's regret into irreducible aliasing cost plus avoidable within-cell misreporting. A Stack-Theoretic reformulation identifies the same quantity as a deficit in weighted weakness on a lifted task constructed from the evaluation support (where weakness is normally the degree to which a policy leaves open unseen diagnostic continuations). I use the identity to derive several direct corollaries, including a representation-convergence theorem in pure RL language, a regret-based partial order on abstractions, Lipschitz stability of K_ρ under margin estimation error, and connections to free energy and multi-agent coordination. A cross-framework corollary converts the regret floor into a generalisation probability. Under the canonical independent prior, the optimal M-based policy generalises with probability exp(-K_ρ(M)). The multi-class generalisation to K>2 diagnostic outcomes is proved. Controlled POMDP experiments confirm the decomposition is numerically exact and that K_ρ discriminates between representations where accuracy and raw impurity do not. The weakness-maximisation theorems predict optimal generalisation through least commitment, but their formal object (the extension of a policy in an embodied language) does not have a direct analogue in neural network function approximation. Bridging that gap is identified as an open problem.
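The central identity can be checked on a toy case. The following is a minimal sketch, not the paper's construction: it assumes binary bets, assumes that a wrong bet on a history-test pair costs exactly that pair's margin, and uses a made-up evaluation set `EVAL` and representation `M`. All function names are hypothetical. It compares the brute-force minimum regret over M-based policies with the cheapest margin-weighted deletion that leaves one optimal bet per cell (M(h),T), then splits a specific policy's regret into that floor plus a residual.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy evaluation set: (history, test, weight, optimal_bet, margin).
EVAL = [
    ("h1", "T", 0.4, 1, 0.5),
    ("h2", "T", 0.3, 0, 0.2),
    ("h3", "T", 0.3, 1, 1.0),
]

# Hypothetical representation M: aliases h1 and h2, keeps h3 separate.
M = {"h1": "a", "h2": "a", "h3": "b"}


def cell_mass(eval_set, rep):
    """Margin-weighted mass of each optimal bet inside each cell (M(h), T)."""
    mass = defaultdict(lambda: defaultdict(float))
    for h, t, w, bet, margin in eval_set:
        mass[(rep[h], t)][bet] += w * margin
    return mass


def deletion_cost(eval_set, rep):
    """Cheapest deletion that leaves a single optimal bet in every cell:
    keep the heaviest bet per cell and delete everything else."""
    cells = cell_mass(eval_set, rep).values()
    return sum(sum(m.values()) - max(m.values()) for m in cells)


def regret(eval_set, rep, policy):
    """Average regret of an M-based policy (a map from cells to bets).
    Assumption: a wrong bet costs exactly the pair's margin."""
    return sum(
        w * margin
        for h, t, w, bet, margin in eval_set
        if policy[(rep[h], t)] != bet
    )


def min_regret_brute_force(eval_set, rep, bets=(0, 1)):
    """Minimum regret over every assignment of one bet per cell."""
    cells = sorted({(rep[h], t) for h, t, *_ in eval_set})
    return min(
        regret(eval_set, rep, dict(zip(cells, choice)))
        for choice in product(bets, repeat=len(cells))
    )


def decompose(eval_set, rep, policy):
    """Split a policy's regret into the aliasing floor and the residual
    above it (here simply computed as regret minus floor)."""
    floor = deletion_cost(eval_set, rep)
    return floor, regret(eval_set, rep, policy) - floor


if __name__ == "__main__":
    print(deletion_cost(EVAL, M), min_regret_brute_force(EVAL, M))
    print(decompose(EVAL, M, {("a", "T"): 0, ("b", "T"): 1}))
```

In this toy case the two minima agree (about 0.06), because the only unavoidable loss comes from cell `a`, where h1 and h2 want different bets and the lighter side must be given up. The policy that bets 0 on cell `a` pays about 0.2 in total, so roughly 0.14 of its regret sits above the floor and could be removed without changing the representation.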
