Mixed Perturbation: Generating Directionally Diverse Perturbations for Adversarial Training
Abstract
The adversarial vulnerability of deep learning models is a critical issue that must be addressed to ensure the safe commercialization of AI technologies. Although adversarial defense methods are being actively studied from various perspectives, most still provide limited robustness, and even the relatively trusted adversarial training is no exception. Developing more reliable defenses requires continued research into the properties and causes of adversarial vulnerability. In this study, we focus on one hypothesis regarding the existence of adversarial examples: that adversarial examples represent low-probability “pockets” in the manifold. Assuming that this hypothesis holds, we propose a method for generating perturbations, “mixed perturbation (MP)”, which aims to discover diverse pocket samples from a defensive perspective. The proposed method generates perturbations by leveraging information from both the main task and the auxiliary tasks in multi-task learning scenarios, and combines them through random weighted summation. The resulting mixed perturbation is intended to maintain the primary directionality of the main-task perturbation, so as to improve the model’s main-task recognition performance, while introducing variability in the perturbation directions. We then use these perturbations in adversarial training to form a more robust decision boundary. Through experiments and analyses conducted on five benchmark datasets, we validate the effectiveness of the proposed method.
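As an illustration only, the random weighted summation described above might be sketched as follows. The abstract does not specify how the individual perturbations are computed or how the weights are sampled, so everything here is an assumption: the function name `mixed_perturbation`, the single-step sign-gradient (FGSM-style) perturbations, the rule that the main-task weight is drawn from `[main_min, 1]` with the remainder split randomly among the auxiliary tasks, and the budget `eps`.

```python
import numpy as np


def mixed_perturbation(grad_main, grads_aux, eps=8 / 255, main_min=0.5, rng=None):
    """Hypothetical sketch of a mixed perturbation (not the authors' exact procedure).

    grad_main : input gradient of the main-task loss
    grads_aux : list of input gradients of the auxiliary-task losses
    eps       : L-infinity perturbation budget (assumed)
    main_min  : lower bound on the main-task weight (assumed)
    """
    if len(grads_aux) == 0:
        raise ValueError("at least one auxiliary-task gradient is required")
    rng = np.random.default_rng() if rng is None else rng

    # Assumption: each task contributes a single-step sign-gradient perturbation.
    delta_main = eps * np.sign(grad_main)
    deltas_aux = [eps * np.sign(g) for g in grads_aux]

    # Assumption: the main task keeps at least `main_min` of the total weight,
    # so the main-task direction dominates the mixture.
    w_main = rng.uniform(main_min, 1.0)

    # The remaining weight is split at random among the auxiliary tasks.
    raw = rng.random(len(deltas_aux))
    w_aux = (1.0 - w_main) * raw / raw.sum()

    # Random weighted summation of the per-task perturbations.
    mixed = w_main * delta_main
    for w, d in zip(w_aux, deltas_aux):
        mixed = mixed + w * d

    # Safeguard: keep the result inside the L-infinity ball of radius eps.
    return np.clip(mixed, -eps, eps)
```

Under these assumptions the weights sum to one, so the mixed perturbation stays within the budget, and with `main_min` at 0.5 or above no component of the mixture points against the main-task perturbation. In adversarial training, the returned array would be added to the clean input before the usual training step.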