Robust Fraud Detection with Ensemble Learning: A Case Study on the IEEE-CIS Dataset

Abstract

The rapid growth of digital financial transactions has been accompanied by a corresponding rise in credit card fraud, which calls for more sophisticated detection systems. This paper analyzes advanced ensemble learning techniques for imbalanced fraud detection on the IEEE-CIS dataset. We address three challenges: extreme class imbalance, concept drift, and real-time detection requirements. To do so, we systematically evaluate ensemble methods, including Random Forest, XGBoost, LightGBM, and novel stacking approaches. Our methodology combines data balancing techniques (SMOTE, ADASYN, Borderline-SMOTE) with feature engineering strategies tailored to the IEEE-CIS dataset, which contains 590,540 transactions with a 3.5% fraud rate. Experimental results show that the proposed stacking ensemble performs best, reaching 91.8% AUC-ROC and 0.891 AUC-PR, and significantly improves fraud detection rates while keeping false positive rates low. The study provides empirical evidence that ensemble methods are effective on severely imbalanced financial fraud datasets and offers practical insights for real-world deployment.
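
To make the data balancing step more concrete, the following is a minimal sketch of SMOTE-style oversampling, one of the techniques named in the abstract. It is not the authors' implementation: the function name `smote_sketch`, its parameters, and the toy data are assumptions chosen for illustration, and it uses only the Python standard library. A real pipeline would more likely rely on a maintained library implementation.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration only. `minority` is a list of
    equal-length numeric feature vectors from the minority (fraud) class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than points
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        # Interpolate at a random position on the segment base -> other.
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority-class points (assumed values, not taken from IEEE-CIS).
fraud = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(fraud, n_synthetic=6, k=2)
```

Each synthetic point lies on a line segment between two existing minority samples, so the oversampled class stays inside the region the original fraud cases already occupy. ADASYN and Borderline-SMOTE refine how base points are chosen; they are not covered by this sketch.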
