Classification Accuracy Estimation Without Labels via Architecture-Agnostic Model Agreement

Abstract

We propose a method to estimate the classification accuracy of a machine learning model without requiring any ground-truth labels, by leveraging the agreement between two models on unlabeled data. Unlike prior work that interprets model agreement as an upper bound on, or average of, the models' accuracies, we are the first to demonstrate that in heterogeneous, label-free settings, agreement reliably approximates the performance of the weaker model in the pair. Our method is architecture-agnostic and requires no labeled data, no assumptions about model calibration, and no prior performance information. We introduce a principled estimator that combines hard-label agreement, probability-level consistency, and a correction term for class imbalance and calibration bias. This estimator remains robust across diverse model types, including convolutional networks and Transformers, and performs reliably on standard image and text classification benchmarks. Experimental results confirm that our estimator closely approximates the accuracy of the weaker model in the pair, often within 1–2% of the ground truth. Our approach operates entirely on in-distribution unlabeled data, offering a practical and reliable solution for model evaluation in real-world scenarios where labeled validation sets are unavailable.
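The core signals described in the abstract, hard-label agreement and probability-level consistency, can be sketched as follows. This is a minimal illustration, not the paper's exact estimator: the equal weighting of the two terms is an assumption, and the abstract's correction term for class imbalance and calibration bias is omitted.

```python
import numpy as np

def agreement_estimate(probs_a, probs_b):
    """Illustrative estimate of the weaker model's accuracy from two
    models' predicted class probabilities on the same unlabeled data.

    probs_a, probs_b: arrays of shape (n_examples, n_classes), rows
    summing to 1. The 50/50 weighting below is a placeholder, not the
    paper's principled combination.
    """
    preds_a = probs_a.argmax(axis=1)
    preds_b = probs_b.argmax(axis=1)
    # Hard-label agreement: fraction of examples where the two
    # models predict the same class.
    hard_agreement = (preds_a == preds_b).mean()
    # Probability-level consistency: 1 minus the mean total-variation
    # distance between the two predictive distributions.
    tv = 0.5 * np.abs(probs_a - probs_b).sum(axis=1).mean()
    soft_consistency = 1.0 - tv
    # Combine the two signals (illustrative equal weights).
    return 0.5 * hard_agreement + 0.5 * soft_consistency
```

When the two models produce identical outputs the estimate is 1.0, and it decreases as their predictions diverge, matching the intuition that agreement tracks the accuracy of the weaker model.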
