Extended Counterfactual Adversarial Examples for Mitigating Privacy Risk in Adversarially Robust Models
Abstract
In this paper, we propose extended Counterfactual Adversarial Example Generation (e-CAEG), an extended version of our conference paper published at APWeb-WAIM 2025. Building on the conference paper, we summarize the contributions of this work as follows. First, e-CAEG leverages latent-space representations to generate in-distribution adversarial examples for both targeted and untargeted scenarios. Second, e-CAEG acts as a regularizer that bridges the generalization gap by forcing the model to rely on robust semantic features. Finally, experiments on MNIST and Fashion-MNIST, supported by t-SNE distributional visualizations, demonstrate that our approach effectively lowers membership-inference accuracy to near-random levels while preserving model utility. Furthermore, we analyze the trade-offs between accuracy, robustness, and privacy, identifying an optimal balance achieved when approximately 95% of the training data consists of e-CAEG-generated examples.