Improving Transferability of Adversarial Examples with Mixed-Representation Attack
Abstract
Although deep neural networks (DNNs) have achieved remarkable performance in image classification, they remain highly vulnerable to adversarial examples, which are crafted by adding human-imperceptible perturbations to benign samples. An important property of adversarial examples is their transferability: the ability to deceive unseen target models, which enables attacks in the black-box setting and helps assess and understand the robustness of DNNs. Recently, various methods have been proposed to boost adversarial transferability, among which input transformation is one of the most effective approaches. We observe that most existing methods in this direction perform geometric transformations in the spatial domain while ignoring potential transformations in the latent space, which may limit the transferability of adversarial examples. To tackle this issue, we propose a novel Mixed-Representation Attack (MRA) that augments input diversity by exploiting transformations on latent representations. Specifically, MRA leverages a Variational Autoencoder to generate representations of the input image and of images randomly sampled from different categories, and then reconstructs images from the mixed representations. Instead of directly computing the average gradient over the reconstructed images, MRA calculates the gradient on the original input mixed with each reconstructed image to generate more transferable adversaries. Extensive experiments on the ImageNet-compatible dataset demonstrate that our MRA achieves state-of-the-art transferability, significantly outperforming various input transformation attacks. Source code will be released at https://github.com/unclelongheu/Mixed-Representation-Attack
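The gradient-averaging scheme described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the mixing coefficients `lam` (latent mixing) and `eta` (input mixing), and the `encode`/`decode`/`model_grad` callables, are hypothetical stand-ins for the paper's VAE and classifier.

```python
import numpy as np

def mra_gradient(x, others, encode, decode, model_grad, lam=0.5, eta=0.5):
    """Sketch of the Mixed-Representation Attack gradient.

    x          -- original input (array)
    others     -- images sampled from different categories
    encode     -- maps an image to its latent representation (stands in for the VAE encoder)
    decode     -- reconstructs an image from a latent code (stands in for the VAE decoder)
    model_grad -- returns the loss gradient w.r.t. an input image
    lam, eta   -- hypothetical mixing weights, not named in the paper
    """
    z_x = encode(x)
    grads = []
    for x_j in others:
        # Mix latent representations of the input and a sampled image,
        # then reconstruct an image from the mixed code.
        z_mix = lam * z_x + (1.0 - lam) * encode(x_j)
        x_rec = decode(z_mix)
        # Per the abstract: take the gradient on the original input
        # mixed with each reconstructed image, not on x_rec alone.
        x_in = eta * x + (1.0 - eta) * x_rec
        grads.append(model_grad(x_in))
    # Average the per-reconstruction gradients for the attack step.
    return np.mean(np.stack(grads), axis=0)

# Toy usage with identity encoder/decoder and a linear surrogate loss
# w . x (so its gradient is the constant vector w).
x = np.array([0.2, 0.5])
others = [np.array([0.8, 0.1]), np.array([0.3, 0.9])]
w = np.array([1.0, -2.0])
g = mra_gradient(x, others, lambda a: a, lambda z: z, lambda a: w)
x_adv = x + 0.03 * np.sign(g)  # one FGSM-style update using the averaged gradient
```

The averaged gradient would then drive an iterative sign-gradient update as in standard transfer attacks; the FGSM-style step above is only illustrative.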