Head Pruning and Malicious Injection: A Retraining-Free Backdoor Attack on Transformer Models

Abstract

Transformer models remain vulnerable to backdoor attacks, yet existing attack methods typically require resource-intensive retraining or disruptive architectural modification of the target model. To address these limitations, we propose Head Pruning and Malicious Injection (HPMI), a retraining-free backdoor attack that preserves the target model's original architecture and requires only a small subset of data and basic architectural knowledge. HPMI identifies and prunes the least significant attention head, then surgically injects a pre-trained malicious head in its place to establish a stealthy backdoor pathway. We provide a rigorous theoretical justification showing that, under reasonable assumptions, HPMI resists detection and removal by state-of-the-art defenses. Experimental evaluations across multiple benchmarks validate HPMI's effectiveness: it incurs a negligible drop in clean accuracy, achieves an attack success rate exceeding 99.55%, and bypasses state-of-the-art defense mechanisms. Compared with retraining-dependent baselines, HPMI achieves superior concealment and robustness while incurring minimal impact on model utility.
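
The abstract does not give implementation details, but the two-step pipeline it describes (score heads, prune the weakest, inject a malicious head in its slot) can be illustrated concretely. The sketch below is a minimal PyTorch illustration under stated assumptions, not the authors' implementation: head importance is approximated by the mean L2 norm of each head's attention output on a small calibration batch, the model is a single nn.MultiheadAttention layer, and malicious_qkv / malicious_out are random placeholders standing in for the pre-trained backdoor head from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

d_model, n_heads = 64, 4
d_head = d_model // n_heads
attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

# Step 1 -- score each head on a small calibration batch.
# Assumed importance proxy: mean L2 norm of each head's attention output.
x = torch.randn(8, 16, d_model)                       # (batch, seq, d_model)
batch, seq = x.shape[:2]
with torch.no_grad():
    q, k, v = F.linear(x, attn.in_proj_weight, attn.in_proj_bias).chunk(3, dim=-1)
    # reshape each to (batch, heads, seq, d_head)
    q, k, v = (t.view(batch, seq, n_heads, d_head).transpose(1, 2) for t in (q, k, v))
    weights = torch.softmax(q @ k.transpose(-2, -1) / d_head**0.5, dim=-1)
    head_out = weights @ v                            # (batch, heads, seq, d_head)
    importance = head_out.norm(dim=-1).mean(dim=(0, 2))  # one score per head

victim = int(importance.argmin())                     # least significant head
print(f"pruning head {victim}; importance scores: {importance.tolist()}")

# Step 2 -- overwrite the pruned head's weight slot with the injected head.
# malicious_qkv / malicious_out are placeholders for the pre-trained
# malicious head; the layer's architecture is left untouched.
malicious_qkv = torch.randn(3, d_head, d_model)
malicious_out = torch.randn(d_model, d_head)
rows = slice(victim * d_head, (victim + 1) * d_head)
with torch.no_grad():
    for i in range(3):                                # Q, K, V row blocks
        attn.in_proj_weight[i * d_model + rows.start : i * d_model + rows.stop] = malicious_qkv[i]
    attn.out_proj.weight[:, rows] = malicious_out     # output columns reading this head
```

Because the malicious head occupies an existing slot rather than being appended, the parameter count and layer shapes are unchanged, which is consistent with the abstract's claim that the original architecture is preserved.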
