Harnessing Exploratory Data Analysis for Robust Financial Fraud Detection and Model Enhancement

Abstract

This paper explores the critical role of Exploratory Data Analysis (EDA) in detecting fraud and ensuring robust machine learning model performance. Applying both univariate and multivariate EDA techniques, including graphical and non-graphical methods, uncovered key trends and relationships within the dataset. The analysis reveals significant variability in the financial data associated with fraud cases and highlights, in particular, the increased scale of fraud in the early 2000s. The EDA process facilitated the identification of outliers, correlations, and potential data quality issues such as missing values and inconsistencies. EDA also informed the data transformations and feature engineering steps that ultimately improved the performance of the machine learning models. Models built with the Random Forest and Classification and Regression Trees (CART) algorithms demonstrated strong classification accuracy and generalized effectively to new data. The findings underscore the importance of EDA in the data modeling process, particularly in fraud detection, where understanding underlying patterns and relationships is essential for developing reliable predictive models.
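
The non-graphical checks named in the abstract (missing values, outliers, correlations) can be illustrated with a minimal sketch. The abstract does not show the paper's dataset or code, so the function names, the 1.5 × IQR outlier rule, and the sample amounts below are all assumptions chosen for illustration, not the authors' method.

```python
import math
import statistics


def _is_missing(value):
    """Treat None and float NaN as missing (an assumed convention)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_missing(values):
    """Univariate data-quality check: how many entries are missing."""
    return sum(1 for v in values if _is_missing(v))


def iqr_outliers(values, k=1.5):
    """Univariate outlier check using the common k * IQR fence rule."""
    clean = [v for v in values if not _is_missing(v)]
    q1, _, q3 = statistics.quantiles(clean, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in clean if v < low or v > high]


def pearson(xs, ys):
    """Multivariate check: Pearson correlation between two numeric columns."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical transaction amounts; None marks a missing record.
amounts = [120.0, 95.0, 110.0, None, 105.0, 98.0, 5000.0, 102.0]

missing = count_missing(amounts)      # one missing entry
outliers = iqr_outliers(amounts)      # the 5000.0 amount falls outside the fences
r = pearson([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear toy columns
```

Checks of this kind would normally run before any transformation or model fitting, so that flagged values can be inspected rather than silently dropped.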
