Stochastic Optimal Transport for Fair Representation Learning in Imbalanced Data Regimes


Abstract

We propose a framework for fair representation learning on imbalanced datasets based on stochastic optimal transport (OT). Because data imbalance degrades both predictive accuracy and fairness, we formulate an optimization problem that combines fairness constraints with the optimal transport objective. Using Wasserstein barycenters, the approach is designed to produce latent representations that satisfy demographic parity and equal opportunity, mitigating bias against underrepresented groups. We provide a theoretical analysis of the method's convergence properties and establish fairness bounds. Empirical evaluations on several datasets, including Adult Income and COMPAS, indicate that our method reduces bias more than state-of-the-art techniques while preserving predictive performance. These findings underscore the importance of building fairness into representation learning and point to directions for future research on equitable AI systems. The resulting methodology can be applied to improve fairness across a range of applications.
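To make the role of Wasserstein barycenters concrete, the following is a minimal sketch and not the authors' method. It assumes one-dimensional scores, where the 2-Wasserstein barycenter has a closed form: its quantile function is the weighted average of the group quantile functions. The helper names `barycenter_map` and `demographic_parity_gap`, the synthetic 900/100 group split, and the 0.5 threshold are all illustrative assumptions that do not come from the paper.

```python
import numpy as np


def barycenter_map(scores, groups, grid_size=101):
    """Push each group's 1-D scores onto the W2 barycenter of the group distributions.

    Hypothetical helper: in 1-D, the barycenter's quantile function is the
    weighted average of the per-group quantile functions.
    """
    qs = np.linspace(0.0, 1.0, grid_size)
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()  # barycenter weights = group proportions (an assumption)
    # Quantile function of every group, evaluated on a shared grid.
    group_q = np.stack([np.quantile(scores[groups == g], qs) for g in labels])
    bary_q = weights @ group_q  # weighted average of the quantile functions
    out = np.empty_like(scores, dtype=float)
    for g in labels:
        mask = groups == g
        x = scores[mask]
        # Within-group empirical CDF rank of every sample, strictly inside (0, 1).
        ranks = (np.argsort(np.argsort(x)) + 0.5) / x.size
        # Evaluate the barycenter quantile function at each sample's rank.
        out[mask] = np.interp(ranks, qs, bary_q)
    return out


def demographic_parity_gap(scores, groups, threshold):
    """Largest difference in positive-decision rate between any two groups."""
    rates = [np.mean(scores[groups == g] > threshold) for g in np.unique(groups)]
    return max(rates) - min(rates)


# Synthetic, imbalanced example: 900 majority samples and 100 minority samples.
rng = np.random.default_rng(0)
groups = np.array([0] * 900 + [1] * 100)
scores = np.concatenate([rng.normal(0.6, 0.1, 900), rng.normal(0.4, 0.1, 100)])

repaired = barycenter_map(scores, groups)
gap_before = demographic_parity_gap(scores, groups, threshold=0.5)
gap_after = demographic_parity_gap(repaired, groups, threshold=0.5)
```

In this sketch the repaired scores of both groups follow approximately the same distribution, so any fixed threshold yields nearly equal positive rates. Learned multi-dimensional representations, equal opportunity constraints, and stochastic OT solvers, as described in the abstract, would require machinery beyond this closed-form 1-D case.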
