Per-Sample Invariant Tracking on MNIST with Certainty-Validity Diagnostics

Abstract

Standard evaluation metrics conflate different kinds of error. A confident incorrect prediction and an uncertain incorrect prediction are both counted identically, even though they reflect different epistemic states. This paper uses the Certainty-Validity (CVS) framework to track how those states migrate during learning. The Minimal Operative Unit (MOU) ring serves as a synthetic diagnostic environment in which quadrant migration, lock thresholds, and intervention timing can be calibrated under controlled conditions before the same procedure is applied per-sample on MNIST. On the ring, baseline training drives CVS from 0.79 to 0.07 while accumulating confident-incorrect states, whereas CVS-regulated and CVS-gated regimes stabilize CVS in the 0.62–0.75 range and reduce confident-incorrect predictions by 11×. On MNIST, freeze-based CVS interventions preserve aggregate CVS but also preserve still-learnable uncertainty, showing that aggregate control is too coarse on its own. Per-sample tracking localizes the residual instability to a small tail rather than broad model degradation: at confidence threshold θ = 0.7, MNIST exhibits no persistent uncertainty population, but it does exhibit a small persistent confident-error tail, a boundary-adjacent volatile set, and a small number of drift cases. In this setting, the main value of CVS is diagnostic: it identifies where commitment is stable, where it is forced, and where uncertainty remains unresolved during training.
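
The per-sample quadrant tracking described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that a prediction counts as "confident" when its top-class probability is at least θ, and that each sample is followed across training epochs. The function names (`quadrant`, `track`, `persistent_confident_errors`, `volatile`), the quadrant labels, the `min_switches` cutoff, and the toy data are all hypothetical. The CVS score itself is not computed here, because the abstract does not give its formula.

```python
THETA = 0.7  # confidence threshold quoted in the abstract


def quadrant(confidence, predicted, label, theta=THETA):
    """Assign one prediction to a certainty/validity quadrant.

    Assumption: "confident" means confidence >= theta. The abstract does not
    say whether the comparison is strict.
    """
    certain = confidence >= theta
    correct = predicted == label
    if certain and correct:
        return "confident-correct"
    if certain:
        return "confident-incorrect"
    if correct:
        return "uncertain-correct"
    return "uncertain-incorrect"


def track(history):
    """Build one quadrant trajectory per sample.

    history: list over epochs; each epoch is a list of
    (confidence, predicted, label) tuples, one per sample, in a fixed order.
    """
    n_samples = len(history[0])
    return [[quadrant(*epoch[i]) for epoch in history] for i in range(n_samples)]


def persistent_confident_errors(trajectories):
    """Indices of samples that are confident-incorrect at every epoch."""
    return [
        i for i, traj in enumerate(trajectories)
        if all(q == "confident-incorrect" for q in traj)
    ]


def volatile(trajectories, min_switches=2):
    """Indices of samples that change quadrant at least min_switches times.

    The cutoff is a hypothetical choice made for this illustration.
    """
    result = []
    for i, traj in enumerate(trajectories):
        switches = sum(1 for a, b in zip(traj, traj[1:]) if a != b)
        if switches >= min_switches:
            result.append(i)
    return result


# Toy data: three samples followed over three epochs (invented values).
history = [
    [(0.55, 3, 3), (0.90, 7, 1), (0.65, 4, 9)],
    [(0.80, 3, 3), (0.92, 7, 1), (0.75, 9, 9)],
    [(0.95, 3, 3), (0.94, 7, 1), (0.60, 4, 9)],
]
trajectories = track(history)
```

In this toy run, sample 1 stays confident-incorrect at every epoch, which is the kind of case the abstract calls a persistent confident-error tail, while sample 2 crosses the threshold back and forth and would fall into a volatile set under the assumed cutoff.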