GAN-based Synthetic Data Generation for Minority Intrusion Classes in IoT Datasets

James Henderson
Micheal Norman

Abstract

The proliferation of Internet of Things (IoT) devices has heightened the need for robust Intrusion Detection Systems (IDS) capable of identifying a wide spectrum of cyber threats. However, a persistent challenge in IoT intrusion detection is the significant class imbalance in publicly available datasets, where minority intrusion classes, such as User-to-Root (U2R) and Remote-to-Local (R2L) attacks, are severely underrepresented. This imbalance leads to poor detection performance for rare but critical attack types. In this study, we propose a Generative Adversarial Network (GAN)-based framework for generating synthetic intrusion samples that specifically target these minority classes. Our approach trains class-conditional GANs to learn the data distribution of underrepresented attacks and to generate high-fidelity synthetic samples, which are then used to augment the training set of conventional classifiers. We conduct extensive experiments on benchmark IoT intrusion datasets, including Bot-IoT and CICIDS2017, and evaluate the impact of GAN-based augmentation on multiple machine learning classifiers. The results demonstrate that incorporating GAN-generated samples significantly improves classification metrics, particularly recall and F1-score, for minority classes without degrading overall system performance. Compared to traditional oversampling methods such as SMOTE, our GAN-based approach achieves more realistic sample generation and better generalization. This research highlights the potential of deep generative models to address data imbalance in cybersecurity applications, offering a promising direction for enhancing the accuracy and reliability of IDS in IoT environments.
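
For orientation, the SMOTE baseline that the abstract compares against can be sketched in a few lines. This is a minimal illustration of SMOTE-style interpolation only, not the authors' GAN framework and not the reference SMOTE implementation. The function name `smote_like_oversample`, its parameters, and the toy feature rows are all hypothetical and are not taken from the preprint.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: four minority-class flows with two numeric features.
minority_rows = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.95], [0.30, 0.70]]
new_rows = smote_like_oversample(minority_rows, n_new=6, k=2, seed=42)
augmented = minority_rows + new_rows
print(len(augmented))
```

Because every synthetic row lies on a straight line between two existing minority samples, this kind of oversampling cannot produce points outside the region the minority class already covers. That limitation is one reason a learned generative model is a natural alternative to evaluate.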

Version published to 10.20944/preprints202507.0325.v1
Jul 3, 2025

