DL-DPGAN: A Correlation-Regularized Differentially Private GAN for Privacy-Utility Balanced Synthetic Data Generation


Abstract

Balancing data utility and privacy in machine learning is crucial, particularly in settings where large volumes of sensitive data are collected and analyzed. Generative Adversarial Networks (GANs) and their privacy-preserving variants, Differentially Private GANs (DPGANs), have been employed to create synthetic data that supports privacy protection. However, prior work has shown that, in practice, synthetic data generated by such models can still exhibit privacy risks, especially when trained on confidential data. In this study, we propose Double Loss DPGAN (DL-DPGAN), an enhanced framework that incorporates a correlation-based privacy loss together with the Wasserstein distance to reduce information leakage and improve training stability. To evaluate the model's performance, we compare the empirical privacy behavior and the quality of the synthetic data generated by DL-DPGAN with those produced by standard GAN and DPGAN baselines. Experiments on multiple benchmark datasets indicate that DL-DPGAN generates synthetic data with strong resistance to distinguishability while maintaining competitive classifier performance compared to these baselines. Overall, DL-DPGAN offers a reasonable balance between privacy protection and model utility, providing a practical approach for privacy-preserving and accurate machine learning.
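The double loss described above pairs a Wasserstein generator term with a correlation-based privacy penalty. A minimal sketch of one plausible formulation is shown below; the abstract does not specify the exact loss, so the Frobenius-norm correlation penalty, the weighting parameter `lam`, and the NumPy framing are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def correlation_penalty(real, synth):
    # Illustrative assumption: penalize the Frobenius-norm gap between
    # the feature-correlation matrices of real and synthetic batches,
    # discouraging the generator from reproducing fine-grained
    # correlations that could leak information about training records.
    c_real = np.corrcoef(real, rowvar=False)
    c_synth = np.corrcoef(synth, rowvar=False)
    return float(np.linalg.norm(c_real - c_synth, ord="fro"))

def generator_loss(critic_scores_synth, real, synth, lam=1.0):
    # Standard Wasserstein generator term: maximize the critic's score
    # on synthetic samples (minimize its negative mean).
    w_term = -float(np.mean(critic_scores_synth))
    # `lam` (hypothetical) trades off utility against the privacy penalty.
    return w_term + lam * correlation_penalty(real, synth)

rng = np.random.default_rng(0)
real = rng.normal(size=(256, 5))
synth = rng.normal(size=(256, 5))
critic_scores = rng.normal(size=256)
loss = generator_loss(critic_scores, real, synth, lam=0.5)
```

In a full DP training loop, the critic's gradients would additionally be clipped and noised (as in DP-SGD) to obtain a formal differential privacy guarantee; the sketch above only illustrates how the two loss terms combine.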
