DEFEND: Intelligent Temporal Backdoor Detection and Mitigation in Federated Learning via Reinforcement Learning-Coordinated Multi-Layer Defense
Abstract
Collaborative machine learning in financial systems faces an escalating security threat: temporal backdoor attacks, which exploit multi-round dependencies to systematically compromise fraud detection and risk assessment models and which existing static defense mechanisms cannot adequately counter. This paper presents DEFEND (DEep Federated Ensemble Network Defense), a comprehensive framework that integrates multi-layer defense with reinforcement learning-based adaptive coordination to counter sophisticated temporal backdoor strategies in federated learning environments. The framework introduces four key innovations: (1) a Temporal Behavioral Analysis Layer that employs multi-scale statistical profiling with dynamic time warping to recognize attack patterns across communication rounds, (2) Byzantine-Robust Statistical Aggregation using geometric median estimation with adaptive outlier detection, (3) a Multi-Scale Validation Protocol with automated model rollback mechanisms, and (4) MDP-based Defense Coordination, which formulates security decisions as a Markov Decision Process optimized via Proximal Policy Optimization to dynamically balance robustness and utility. Extensive experiments on the FinMultiTime dataset across three distinct market periods (2009-2025) demonstrate superior performance over state-of-the-art baselines, achieving defense success rates of 95.6% ± 1.0% for ResNet-18 and 94.0% ± 1.2% for MobileNet-V2 while maintaining clean accuracy above 85%. Ablation studies reveal that the MDP-based coordination provides the largest individual contribution (an 8.2% improvement in defense success rate), while the complete multi-layer architecture achieves up to 18.7% improvement over single-layer baselines. Cross-period generalization analysis demonstrates robust transferability, with less than 6% performance degradation across different market regimes, validating practical deployment viability in dynamic financial environments.
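To make the geometric median aggregation named in innovation (2) concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the median is computed, so Weiszfeld's iteration is assumed here. The function name `geometric_median`, its parameters, and the toy client updates are all hypothetical and chosen only for illustration.

```python
import numpy as np

def geometric_median(updates, max_iter=100, tol=1e-6, eps=1e-12):
    """Approximate the geometric median of row vectors with Weiszfeld's iteration.

    Assumption: each row of `updates` is one client's flattened model update.
    """
    points = np.asarray(updates, dtype=float)
    estimate = points.mean(axis=0)  # start from the ordinary mean
    for _ in range(max_iter):
        distances = np.linalg.norm(points - estimate, axis=1)
        weights = 1.0 / np.maximum(distances, eps)  # guard against division by zero
        new_estimate = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate

# Hypothetical toy data: four honest updates near (1, 1) and one poisoned update far away.
client_updates = np.array([
    [1.0, 1.0],
    [1.1, 0.9],
    [0.9, 1.1],
    [1.0, 1.05],
    [50.0, -50.0],
])
robust = geometric_median(client_updates)  # stays close to the honest cluster
naive = client_updates.mean(axis=0)        # dragged toward the poisoned update
```

In this toy setting the plain mean is pulled far from the honest cluster by the single outlier, while the geometric median remains near it. That robustness is the usual motivation for geometric median aggregation under Byzantine clients. The adaptive outlier detection that the abstract pairs with it is not modeled here.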