Certainty-Validity: A Diagnostic Framework for Discrete Commitment Systems


Abstract

Standard evaluation metrics for machine learning—accuracy, precision, recall, and AUROC—assume that all errors are equivalent: a confident incorrect prediction is penalized identically to an uncertain one. For discrete commitment systems (architectures that select committed states {−W, 0, +W}), this assumption is epistemologically flawed. We introduce the Certainty-Validity (CVS) Framework, a diagnostic method that decomposes model performance into a 2×2 matrix distinguishing high/low certainty from valid/invalid predictions. This framework reveals a critical failure mode hidden by standard accuracy: Confident-Incorrect (CI) behavior, where models hallucinate structure in ambiguous data. Through ablation experiments on Fashion-MNIST, EMNIST, and IMDB, we analyze the “83% Ambiguity Ceiling”—a stopping point where this specific discrete architecture consistently plateaus on noisy benchmarks. Unlike continuous models that can surpass this ceiling by memorizing texture or statistical noise, the discrete model refuses to commit to ambiguous samples. We show that this refusal is not a failure but a feature: the model stops where structural evidence ends. However, standard training on ambiguous data eventually forces Benign Overfitting, causing a pathological migration from Uncertain-Incorrect (UI) (appropriate doubt) to Confident-Incorrect (CI) (hallucination). We propose that “good training” for reasoning systems must be defined not by accuracy, but by maximizing the Certainty-Validity Score (CVS)—ensuring the model knows where to stop.
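The 2×2 decomposition described in the abstract can be sketched in a few lines of code. The quadrant labels beyond CI and UI, the certainty `threshold`, and the scalar `cvs_score` formula below are illustrative assumptions, not definitions from the paper; the abstract only names Confident-Incorrect and Uncertain-Incorrect and states that the score should reward knowing where to stop.

```python
def cvs_quadrants(confidences, correct, threshold=0.8):
    """Split predictions into the 2x2 certainty/validity matrix.

    confidences : iterable of certainty scores in [0, 1]
    correct     : iterable of bools, True where the prediction is valid
    threshold   : certainty cutoff (a hypothetical choice, not from the paper)
    """
    counts = {"CV": 0, "CI": 0, "UV": 0, "UI": 0}
    for conf, ok in zip(confidences, correct):
        # "C" = high certainty, "U" = low; "V" = valid, "I" = invalid
        key = ("C" if conf >= threshold else "U") + ("V" if ok else "I")
        counts[key] += 1
    return counts


def cvs_score(counts):
    """One plausible scalar summary (an assumption, not the paper's formula):
    reward Confident-Valid and Uncertain-Invalid (appropriate doubt),
    penalize Confident-Invalid (hallucination)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["CV"] + counts["UI"] - counts["CI"]) / total
```

Under this sketch, a model that migrates mass from UI to CI (the pathological shift the abstract describes) sees its score drop even when raw accuracy is unchanged, which is the diagnostic behavior the framework is after.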
