Effective Credit Card Fraud Detection Using Data Mining Techniques

Ooi Jing Xian

Abstract

Credit card fraud creates financial and security problems for businesses and consumers worldwide. As fraudulent activity grows more sophisticated, effective and efficient fraud detection systems have become essential. This study examines how machine learning techniques can be applied to credit card fraud detection and how to overcome challenges such as class imbalance, high dimensionality, and the complexity of real-world datasets. The IEEE-CIS Fraud Detection dataset, a publicly available and highly complex dataset, was used to evaluate the performance of several machine learning models. Five models (Logistic Regression, Random Forest, XGBoost, LightGBM, and Deep Neural Networks (DNN)) were compared to establish a performance baseline on the full dataset using stratified k-fold cross-validation. Feature engineering was then performed using the best-performing model (LightGBM), with gain-based importance and cumulative feature importance used to identify and retain the most relevant features. The model was retrained on the reduced dataset, and its performance was compared with that on the full dataset to assess the effectiveness of the feature engineering process. A key finding is that feature engineering reduced dataset dimensionality and improved predictive performance, especially for detecting fraudulent transactions. The results point to ensemble methods and advanced feature selection techniques as a promising basis for building robust fraud detection systems. This research adds to the literature on machine learning applications in fraud detection and advances understanding of how to balance computational efficiency, interpretability, and accuracy.
By addressing the limitations of traditional approaches and applying state-of-the-art machine learning methodologies, this study offers practical and theoretical contributions to the fight against credit card fraud, to future research, and to real-world implementations.
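
The cumulative-importance step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that per-feature gain scores are already available (with LightGBM these could be read from `Booster.feature_importance(importance_type="gain")`) and keeps the top-ranked features until their share of total gain reaches a cutoff. The function name `select_by_cumulative_gain`, the 0.90 cutoff, and the feature names and values are hypothetical; the abstract does not state the threshold the study used.

```python
from typing import Dict, List


def select_by_cumulative_gain(gain: Dict[str, float], threshold: float = 0.90) -> List[str]:
    """Keep the highest-gain features until their share of total gain reaches `threshold`.

    Illustrative helper only; not the study's actual implementation.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    total = sum(gain.values())
    if total <= 0:
        raise ValueError("total gain must be positive")

    # Rank features from most to least important by gain.
    ranked = sorted(gain.items(), key=lambda item: item[1], reverse=True)

    kept: List[str] = []
    cumulative = 0.0
    for name, value in ranked:
        kept.append(name)
        cumulative += value
        # Stop once the retained features cover the requested share of total gain.
        if cumulative >= threshold * total:
            break
    return kept


# Hypothetical gain scores, for illustration only (not results from the study).
example_gain = {"f_a": 50.0, "f_b": 30.0, "f_c": 15.0, "f_d": 4.0, "f_e": 1.0}
selected = select_by_cumulative_gain(example_gain, threshold=0.90)
print(selected)
```

In this toy input the three highest-gain features account for 95% of total gain, so the remaining two are dropped. A retrained model would then use only the retained columns, and its cross-validated scores would be compared against the full-feature baseline.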

Version published to 10.51244/ijrsi.2025.1213cs008
Nov 15, 2025
Version published to 10.21203/rs.3.rs-6618792/v1 on Research Square
May 9, 2025

Related articles

Beyond Accuracy: Economic Performance of Machine Learning Models in Financial Fraud Detection

This article has 3 authors:
1. Pedro-Pablo Chambi-Condori
2. Miriam Chambi-Vásquez
3. Telma Saravia-Ticona
This article has no evaluations
Latest version Feb 25, 2026

Improving Medicare Fraud Detection Accuracy in Deep Learning by Exploring Feature Selection and Data Sampling Techniques

This article has 3 authors:
1. Fahad Ahammed
2. Bayan Al Barakati
3. Oge Marques
This article has no evaluations
Latest version Mar 20, 2026

Detection of Fraudulent Internship Opportunities Using Machine Learning Techniques

This article has 1 author:
1. Priyanshi Rajendrakumar Patel
This article has no evaluations
Latest version Feb 6, 2026
