Machine metacognition improves classification performance and uncertainty quantification
Abstract
A key issue in contemporary artificial intelligence -- ranging from large language models to specialized systems for image classification -- is the capacity to express uncertainty in a decision or judgment. Deep neural networks often produce poorly calibrated confidence judgments, making it difficult to reliably use them, or integrate them with human decision-makers, especially in high-stakes domains like medical diagnosis. To address this issue, we propose a latent-space density approach that estimates item familiarity as a second-order (meta)representation of uncertainty. First, we trained an EfficientNet deep convolutional neural network on 10,000 dermoscopic images for melanoma classification and extracted the penultimate-layer activations. Rather than using these features only for classification, we also passed them to a k-nearest neighbour algorithm and used them to calculate a composite measure of neighbourhood consistency. Comparing ROC-AUC scores from logistic classifiers trained on (i) softmax outputs, (ii) density metrics of neighbourhood consistency, or (iii) both, we found that including density-based signals alongside softmax outputs substantially improved the detection of classification errors, particularly high-confidence errors. This approach provides a principled and computationally efficient method for uncertainty estimation in neural networks, borrowing heuristics from human metacognition to construct a practical and widely applicable form of machine metacognition.
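The pipeline described in the abstract can be sketched as follows. This is a minimal illustration only: the feature extractor, the choice of k, and the exact composite "neighbourhood consistency" measure are assumptions here (the paper's definitions may differ), and synthetic arrays stand in for the real penultimate-layer activations and softmax outputs.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Stand-ins for penultimate-layer activations of a trained CNN.
train_feats = rng.normal(size=(500, 32))
train_labels = rng.integers(0, 2, size=500)
test_feats = rng.normal(size=(200, 32))
# Stand-in softmax confidence for the predicted class, and error indicator.
test_softmax = rng.uniform(0.5, 1.0, size=200)
test_error = rng.integers(0, 2, size=200)  # 1 = misclassified

# Density metrics from the k nearest training items in latent space.
k = 10
nn = NearestNeighbors(n_neighbors=k).fit(train_feats)
dists, idx = nn.kneighbors(test_feats)
mean_dist = dists.mean(axis=1)  # "familiarity": low = dense, familiar region
neigh_labels = train_labels[idx]
# Label homogeneity among neighbours (fraction in the majority class).
p1 = neigh_labels.mean(axis=1)
label_consistency = np.maximum(p1, 1.0 - p1)

# Logistic classifiers predicting errors from (i) softmax, (ii) density, (iii) both.
X_soft = test_softmax.reshape(-1, 1)
X_dens = np.column_stack([mean_dist, label_consistency])
X_both = np.column_stack([X_soft, X_dens])

aucs = {}
for name, X in [("softmax", X_soft), ("density", X_dens), ("both", X_both)]:
    # In-sample fit and scoring for brevity; use held-out folds in practice.
    clf = LogisticRegression().fit(X, test_error)
    aucs[name] = roc_auc_score(test_error, clf.predict_proba(X)[:, 1])
    print(f"{name}: ROC-AUC = {aucs[name]:.3f}")
```

With real features, comparing the three ROC-AUC values indicates how much the density signals add to softmax confidence for flagging errors; with the random stand-ins above, all three hover near chance.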