Assessing the Risk of Discriminatory Bias in Classification Datasets


Abstract

Bias in machine learning models remains a critical challenge, particularly in datasets with numeric features, where discrimination may be subtle and hard to detect. Existing fairness frameworks rely on expert knowledge of marginalized groups, such as specific racial groups, and on the categorical features that define them. Furthermore, most frameworks evaluate bias in models rather than in datasets, even though model bias can often be traced back to dataset shortcomings. Our research addresses this gap by capturing dataset flaws in a set of dataset-level meta-features and by warning practitioners of bias risk before such datasets are used for model training. We neither restrict the feature type nor require domain knowledge. To this end, we develop methods to synthesize biased datasets and extend current fairness metrics to continuous features in order to quantify dataset-level discrimination risks. Our approach constructs a meta-database of diverse datasets, from which we derive transferable meta-features that capture dataset properties indicative of bias risk. Our findings demonstrate that dataset-level characteristics can serve as cost-effective indicators of bias risk, providing a novel method for data auditing that does not rely on expert knowledge. This work lays the foundation for early-warning systems, moving beyond model-focused assessments toward a data-centric approach.
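
The abstract only gestures at how fairness metrics might be extended to continuous features. As a purely illustrative Python sketch (not the authors' actual method), the snippet below generalizes a demographic-parity-style gap to a continuous feature by splitting it at its median; the function name parity_gap_continuous and the median split are assumptions introduced here for illustration.

import numpy as np

def parity_gap_continuous(y_pred, sensitive):
    # Absolute difference in positive-prediction rates between the groups
    # above and below the median of a continuous sensitive feature.
    y_pred = np.asarray(y_pred)
    sensitive = np.asarray(sensitive, dtype=float)
    high = sensitive > np.median(sensitive)
    return abs(y_pred[high].mean() - y_pred[~high].mean())

# Toy usage: predictions that track the continuous feature yield a large gap,
# flagging a potential discrimination risk in the data.
rng = np.random.default_rng(0)
feature = rng.normal(size=1000)
preds = (feature + rng.normal(scale=0.5, size=1000) > 0).astype(int)
print(parity_gap_continuous(preds, feature))  # well above 0 -> strong dependence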
