Classification Accuracy Estimation Without Labels via Architecture-Agnostic Model Agreement

Abstract

We propose a method to estimate the classification accuracy of a machine learning model without requiring any ground-truth labels, by leveraging the agreement between two models on unlabeled data. Unlike prior work that interprets model agreement as an upper bound on, or average of, the models' accuracies, we are the first to demonstrate that in heterogeneous, label-free settings, agreement reliably approximates the performance of the weaker model in the pair. Our method is architecture-agnostic and requires no labeled data, no assumptions about model calibration, and no prior performance information. We introduce a principled estimator that combines hard-label agreement, probability-level consistency, and a correction term for class imbalance and calibration bias. This estimator remains robust across diverse model types, including convolutional networks and Transformers, and performs reliably on standard image and text classification benchmarks. Experimental results confirm that our estimator closely approximates the accuracy of the weaker model in the pair, often within 1–2% of the ground truth. Our approach operates entirely on in-distribution unlabeled data, offering a practical and reliable solution for model evaluation in real-world scenarios where labeled validation sets are unavailable.
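The core signals described in the abstract, hard-label agreement and probability-level consistency, can be sketched as follows. This is a minimal illustration, not the paper's exact estimator: the equal weighting of the two terms is an assumption, and the abstract's correction term for class imbalance and calibration bias is omitted.

```python
import numpy as np

def agreement_estimate(probs_a, probs_b):
    """Illustrative estimate of the weaker model's accuracy from two
    models' predicted class probabilities on the same unlabeled data.

    probs_a, probs_b: arrays of shape (n_examples, n_classes), rows
    summing to 1. The 50/50 weighting below is a placeholder, not the
    paper's principled combination.
    """
    preds_a = probs_a.argmax(axis=1)
    preds_b = probs_b.argmax(axis=1)
    # Hard-label agreement: fraction of examples where the two
    # models predict the same class.
    hard_agreement = (preds_a == preds_b).mean()
    # Probability-level consistency: 1 minus the mean total-variation
    # distance between the two predictive distributions.
    tv = 0.5 * np.abs(probs_a - probs_b).sum(axis=1).mean()
    soft_consistency = 1.0 - tv
    # Combine the two signals (illustrative equal weights).
    return 0.5 * hard_agreement + 0.5 * soft_consistency
```

When the two models produce identical outputs the estimate is 1.0, and it decreases as their predictions diverge, matching the intuition that agreement tracks the accuracy of the weaker model.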
