A Deep learning based network traffic classification algorithm using data augmentation and multi-feature fusion
Abstract
Network traffic classification plays an important role in network security and in optimizing network resources, and deep learning techniques can substantially improve its accuracy. Existing methods, however, face two challenges: imbalanced traffic data (e.g., insufficient malicious traffic for training) and the suboptimal performance of single-feature extraction networks. To address them, this paper proposes a novel network traffic classification algorithm that uses data augmentation to expand categories with few training samples and integrates multiple feature extraction models for feature fusion. First, generative adversarial networks (GANs) and denoising diffusion probabilistic models (DDPMs) are used to augment the traffic categories that have limited training data. Then, the one-dimensional time-series training data are transformed into color images using the Markov transition field (MTF) technique. Next, the VGG16, ResNet18, and Vision Transformer (ViT) networks, each pre-trained on large-scale datasets, extract features for fusion. The parameters of the convolutional layers in the three networks are frozen and only the newly added fully connected layers are trained, which makes feature extraction and fusion more efficient. Finally, offline classification training is performed to obtain the network traffic classification model. Using GANs and DDPMs for data augmentation ensures data balance and diversity, while multi-feature extraction and fusion capture more comprehensive data characteristics, improving both the quality of the extracted features and the classification performance. Experimental results demonstrate that the proposed algorithm achieves higher network traffic classification accuracy than some existing methods.
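To make the MTF step concrete, the following is a minimal sketch of the standard Markov transition field construction, not the authors' implementation. The function name `markov_transition_field`, the quantile binning, and the default of four bins are assumptions chosen for illustration; the abstract does not state the bin count, the image size, or how the field is mapped to color channels.

```python
import numpy as np


def markov_transition_field(series, n_bins=4):
    """Return the Markov transition field of a 1-D series (illustrative sketch)."""
    x = np.asarray(series, dtype=float)

    # Assumption: quantile binning, so each state holds roughly equal counts.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    states = np.digitize(x, edges)  # integer state per time step, 0..n_bins-1

    # Count first-order transitions between consecutive time steps.
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (states[:-1], states[1:]), 1)

    # Row-normalise to transition probabilities; empty rows stay zero.
    row_sums = counts.sum(axis=1, keepdims=True)
    trans = np.divide(counts, row_sums,
                      out=np.zeros_like(counts), where=row_sums > 0)

    # Entry (i, j) is the probability of moving from the state at time i
    # to the state at time j, giving a T x T single-channel image.
    return trans[states[:, None], states[None, :]]


if __name__ == "__main__":
    mtf = markov_transition_field([1, 2, 3, 4, 1, 2, 3, 4])
    print(mtf.shape)
```

The result is a single-channel matrix with values in [0, 1]. Producing the color image mentioned in the abstract would need a further step, such as applying a colormap or stacking channels, which is left out here because the abstract does not describe it.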