Multi-Patch Grid Attack: A Distributed and Defense-Resilient Backdoor in Federated Medical Attention Models
Abstract
Vision Transformers (ViTs) are increasingly adopted in federated medical imaging; however, their patch-based attention architecture introduces unexplored backdoor vulnerabilities. Existing attacks target convolutional neural network (CNN) architectures with localized triggers and are readily mitigated by norm-based and clustering defenses, leaving federated ViT systems under-evaluated. We propose the multi-patch grid attack (MPGA), a distributed backdoor that exploits ViT patch tokenization by placing synchronized perturbations across four corner patches (8×8 pixels, intensity 0.9) and a central cross pattern (1-pixel width, intensity 0.6), distributing malicious features across multiple attention tokens. This design maintains gradient similarity to benign updates (cosine similarity 0.89, norm ratio 0.944±0.08) and keeps pixel-level distribution divergence below D_KL = 0.0123. Evaluated on the Figshare brain tumor MRI dataset across 10 federated clients, MPGA achieves a backdoor success rate (BSR) of 94.13% while preserving 86.13% main task accuracy (MTA), and generalizes to the IEEE Dataport dataset (95.00% BSR, 89.33% MTA). The backdoor persists after attack cessation, retaining 87.00% attack success rate (ASR) after 15 consecutive benign-only rounds. Against ten state-of-the-art defenses spanning Byzantine-robust aggregation, Sybil detection, behavioral monitoring, and trigger-agnostic inversion (Neural Cleanse, ABS, STRIP), no defense meets the combined success criteria (ASR<40%, MTA>85%, F1>0.70). Compared with BadNets and DBA, MPGA achieves the highest ASR across all evaluated defenses, confirming that ViT patch-boundary alignment is the key differentiator. These results expose fundamental limitations of current defense paradigms against architecture-aware distributed backdoors in federated ViT systems.
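To make the trigger geometry and the update-similarity statistics concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. The abstract fixes only the corner patch size (8×8), the cross width (1 pixel), and the two intensities (0.9 and 0.6). Everything else is an assumption made for illustration: a single-channel image normalized to [0, 1], a 224×224 input size in the usage lines, pixels overwritten rather than blended, a cross centered on the image with a hypothetical arm half-length `cross_half`, and the function names `apply_grid_trigger` and `update_similarity`. The norm ratio is assumed here to be the malicious update norm divided by the benign update norm.

```python
import numpy as np


def apply_grid_trigger(image, patch=8, corner_intensity=0.9,
                       cross_intensity=0.6, cross_half=8):
    """Stamp four corner patches and a 1-pixel-wide central cross.

    `image` is assumed to be a 2D float array in [0, 1] (H x W).
    `cross_half` (arm half-length in pixels) is a hypothetical parameter;
    the abstract does not state the cross length.
    """
    out = image.astype(np.float32).copy()  # never modify the input in place
    h, w = out.shape

    # Four corner patches of size patch x patch (assumed to overwrite pixels).
    out[:patch, :patch] = corner_intensity
    out[:patch, -patch:] = corner_intensity
    out[-patch:, :patch] = corner_intensity
    out[-patch:, -patch:] = corner_intensity

    # Central cross: one row and one column, each a single pixel wide.
    cy, cx = h // 2, w // 2
    out[cy, cx - cross_half:cx + cross_half + 1] = cross_intensity
    out[cy - cross_half:cy + cross_half + 1, cx] = cross_intensity
    return out


def update_similarity(malicious_update, benign_update):
    """Return (cosine similarity, norm ratio) of two flattened updates.

    The norm ratio is assumed to be ||malicious|| / ||benign||.
    """
    m = np.ravel(malicious_update).astype(np.float64)
    b = np.ravel(benign_update).astype(np.float64)
    m_norm = np.linalg.norm(m)
    b_norm = np.linalg.norm(b)
    cosine = float(np.dot(m, b) / (m_norm * b_norm))
    ratio = float(m_norm / b_norm)
    return cosine, ratio


# Usage on synthetic data only (no real MRI scans or model updates).
rng = np.random.default_rng(0)
clean = rng.random((224, 224), dtype=np.float32)
triggered = apply_grid_trigger(clean)

benign = rng.normal(size=1000)
malicious = benign + 0.1 * rng.normal(size=1000)
cosine, ratio = update_similarity(malicious, benign)
```

With a 16×16 ViT patch size on a 224×224 input (also an assumption), each 8×8 corner patch falls inside a different corner token and the cross spans the tokens around the image center, which is one way to read the abstract's claim that the trigger is distributed across multiple attention tokens.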