GAN_BERT: An Advanced Neural Architecture for Effective Fraud Detection on Imbalanced Datasets

Hao Wang
Yuxin Gong
Chang Yu

Abstract

In the financial sector, fraud detection has long challenged researchers, particularly when the dataset is highly imbalanced. Because fraudulent activity is rare, severely imbalanced datasets are common, and traditional machine learning models generalize poorly to the minority class. To address this challenge, we introduce GAN_BERT, a hybrid neural architecture that combines Conditional Tabular Generative Adversarial Networks (CTGAN) for synthetic data generation with a transformer-based Bidirectional Encoder Representations from Transformers (BERT) classifier. Within GAN_BERT, each component targets a different issue. The CTGAN module captures the intrinsic patterns underlying fraud records and generates high-quality synthetic samples for training. The data loader module combines training data and synthetic samples in a stratified way, which substantially increases the model's exposure to the minority class. Lastly, the classifier module learns the temporal relationships among fraudulent transactions and identifies fraud accurately while maintaining a low false-alarm rate. Evaluated on benchmark datasets against other state-of-the-art models, GAN_BERT shows noticeable improvements in precision, recall, and F1-score for the minority class. We propose GAN_BERT as a robust, flexible, and scalable solution for fraud detection, especially on imbalanced datasets. Our findings may also apply to other domains facing similar challenges.
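
To make the data loader step and the reported metrics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the synthetic fraud records have already been generated (here they are placeholder strings standing in for CTGAN output), and the names stratified_batches and minority_prf are hypothetical helpers introduced only for illustration. The first function spreads every class evenly across batches so that each batch sees the same fraud-to-legitimate ratio as the combined pool; the second computes precision, recall, and F1-score for the minority (fraud) class.

```python
import random
from collections import defaultdict


def stratified_batches(samples, labels, batch_size, seed=0):
    """Yield batches whose class mix mirrors the class mix of the whole pool.

    Hypothetical helper: the source does not describe its loader in detail.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append((x, y))
    for items in by_class.values():
        rng.shuffle(items)
    n_batches = max(1, len(samples) // batch_size)
    for b in range(n_batches):
        batch = []
        for items in by_class.values():
            # Batch b receives the b-th proportional slice of each class.
            start = b * len(items) // n_batches
            end = (b + 1) * len(items) // n_batches
            batch.extend(items[start:end])
        rng.shuffle(batch)
        yield batch


def minority_prf(y_true, y_pred, positive=1):
    """Precision, recall, and F1-score for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Assumed toy data: 90 legitimate and 10 fraudulent real records, plus
# 30 synthetic fraud records standing in for generator output.
real = [f"legit_{i}" for i in range(90)] + [f"fraud_{i}" for i in range(10)]
real_labels = [0] * 90 + [1] * 10
synthetic = [f"synthetic_fraud_{i}" for i in range(30)]

pool = real + synthetic
pool_labels = real_labels + [1] * len(synthetic)
batches = list(stratified_batches(pool, pool_labels, batch_size=13))
```

With these assumed counts the pool holds 90 legitimate and 40 fraud records, so each of the 10 batches contains 9 legitimate and 4 fraud records. Adding synthetic samples raises the fraud share per batch from 10% to roughly 31%, which is the kind of increased minority-class exposure the abstract describes.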

Version published to 10.20944/preprints202506.1138.v1
Jun 13, 2025

Related articles

Fraud Detection Pipeline Using Machine Learning: Methods, Applications, and Future Directions

This article has 1 author:
1. Arimondo Scrivano
This article has no evaluations
Latest version Jul 4, 2025

Fraud Detection in Online Transactions: Toward Hybrid Supervised–Unsupervised Learning Pipelines

This article has 4 authors:
1. Shuo Xu
2. Yuchen Cao
3. Zhongyan Wang
4. Yexin Tian
This article has no evaluations
Latest version May 14, 2025

Robust Anomaly Detection in Financial Markets Using LSTM Autoencoders and Generative Adversarial Networks

This article has 2 authors:
1. Jian Yang
2. Lili Liu
This article has no evaluations
Latest version Jun 25, 2025
