Generative AI Driven Synthetic Attack Augmentation for Enhanced Intrusion Detection Using an Imbalanced Dataset

Abstract

Intrusion Detection Systems (IDS) are essential to securing modern networks, but datasets of real network traffic suffer from severe class imbalance, leaving minority attack types heavily underrepresented. Critical attacks in popular datasets, including Brute Force and Web Attacks, are very infrequent compared with regular traffic and high-volume attacks, which causes biased learning, high false-negative rates, and poor detection of minority attacks. To address this problem, this paper proposes a Generative AI-based synthetic attack augmentation model built on Conditional Tabular Generative Adversarial Networks (CTGAN) to improve IDS performance under class imbalance. The strategy aims to produce high-fidelity synthetic samples of minority attack classes while preserving the statistical properties and behavioral patterns of real network traffic. Ensemble-based machine learning models, namely Random Forest and Extreme Gradient Boosting (XGBoost), are trained and tested using the augmented dataset. Experiments on the CICIDS2017 dataset show that detection of minority attacks improves significantly. Synthetic augmentation raised Recall for Web Attacks from 28 to 91 with Random Forest and from 32 to 94 with XGBoost, and Recall for Brute Force from 45 to 95 and from 55 to 98, respectively. Overall Recall and F1-score also improved significantly, with XGBoost reaching an F1-score of 94% on the augmented dataset. These findings support the hypothesis that Generative AI-based synthetic data augmentation mitigates class imbalance, reduces false negatives, and increases the resilience and reliability of intrusion detection systems in real-world cybersecurity settings.
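
The two quantities the abstract relies on, per-class Recall and the number of synthetic rows a minority class would need, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy labels, and the `target_count` parameter are assumptions made for illustration, and the generator itself (CTGAN) is deliberately left out.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall per class: TP / (TP + FN), computed from paired label lists."""
    support = Counter(y_true)  # TP + FN for each class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: true_pos[label] / support[label] for label in support}


def synthetic_rows_needed(class_counts, target_count):
    """Rows to generate per class so that each class reaches target_count."""
    return {label: max(0, target_count - n) for label, n in class_counts.items()}


# Hypothetical toy labels for illustration only; these are not CICIDS2017 figures.
y_true = ["Benign"] * 8 + ["Web Attack"] * 2
y_pred = ["Benign"] * 8 + ["Benign", "Web Attack"]

recalls = per_class_recall(y_true, y_pred)
plan = synthetic_rows_needed(Counter(y_true), target_count=8)

print(recalls)
print(plan)
```

In this toy case overall accuracy is 90% even though half of the minority-class samples are missed, which is why the abstract reports Recall per attack class rather than a single aggregate score.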
