A Two-Layer BiLSTM distillation-based method for network intrusion detection


Abstract

Traditional intrusion detection systems (IDS) based on rules and signatures struggle with complex temporal data and imbalanced datasets. Loosely connected temporal context features, label mismatches, and class imbalance arising from a diverse range of traffic types all significantly degrade detection performance. To address these challenges, this paper proposes a two-layer BiLSTM distillation-based network intrusion detection method (2DBM). The method integrates bidirectional LSTM (BiLSTM) layers with a dual distillation mechanism to capture temporal context features in both the forward and backward directions, and it improves generalization on imbalanced datasets through knowledge transfer between teacher and student models. Key innovations include: (1) a dual distillation framework that transfers knowledge from a high-accuracy teacher model to a lightweight student model, reducing parameters by 38% (from 7.8M to 4.8M) and inference latency by 60% (from 30 ms to 12 ms); (2) a temporal attention mechanism that dynamically weights key network traffic features from the BiLSTM to improve robustness against imbalanced data; and (3) a two-layer BiLSTM architecture that captures bidirectional contextual dependencies while maintaining computational efficiency. In experiments on the UNSW-NB15 and CIC-IDS2017 datasets, the model achieves an accuracy of 99.86% on UNSW-NB15 and 99.32% on CIC-IDS2017, with a false positive rate (FPR) below 0.5%. Compared with existing state-of-the-art models (Transformer-IDS and LightGRU), the proposed model achieves higher accuracy and better real-time performance with a lighter model, addressing the bottleneck that traditional IDS face with imbalanced data and complex temporal contexts.
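
The abstract does not state the loss used for teacher-student knowledge transfer. As a rough illustration only, the sketch below shows the standard temperature-scaled distillation loss (soft-target KL divergence plus hard-label cross-entropy), which may differ from the paper's dual distillation mechanism. The function names, the `temperature` and `alpha` parameters, and the example logits are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss (an assumption, not the paper's exact loss).

    alpha weights the soft-target term; (1 - alpha) weights the hard-label term.
    """
    # Soft-target term: KL(teacher || student) at the given temperature,
    # scaled by temperature^2 as is conventional for this loss.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Hard-label term: cross-entropy of the student against the true class.
    ce = -math.log(softmax(student_logits)[true_label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

# Hypothetical two-class logits (benign vs. attack) for one traffic sample.
teacher = [0.2, 3.1]
agreeing_student = [0.3, 2.9]
disagreeing_student = [2.5, 0.1]
loss_agree = distillation_loss(agreeing_student, teacher, true_label=1)
loss_disagree = distillation_loss(disagreeing_student, teacher, true_label=1)
```

A student whose logits track the teacher's incurs a smaller loss than one that disagrees, which is the signal that pushes the lightweight model toward the teacher's behavior during training.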
