Harnessing Exploratory Data Analysis for Robust Financial Fraud Detection and Model Enhancement
Abstract
This paper explores the critical role of Exploratory Data Analysis (EDA) in detecting fraud and ensuring robust machine learning model performance. Univariate and multivariate EDA techniques, both graphical and non-graphical, were applied to uncover key trends and relationships within the dataset. The analysis reveals significant variability in the financial data associated with fraud cases and, in particular, an increase in the scale of fraud in the early 2000s. The EDA process identified outliers, correlations, and potential data quality issues such as missing values and inconsistencies. It also informed the data transformations and feature engineering steps that ultimately improved the performance of the machine learning models. Models built with the Random Forest and Classification and Regression Trees (CART) algorithms demonstrated strong classification accuracy and generalized effectively to new data. The findings underscore the importance of EDA in the data modeling process, particularly in fraud detection, where understanding underlying patterns and relationships is essential for developing reliable predictive models.
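As a rough illustration of the non-graphical EDA checks the abstract mentions (missing values, univariate summaries, outlier detection, and correlation), the following minimal Python sketch may help. It is not the paper's code or data: the records, the column meanings, and the helper names `iqr_outliers` and `pearson` are all hypothetical, and Tukey's 1.5 × IQR rule is only one common way to flag outliers.

```python
import math
import statistics

# Hypothetical toy records: (year, loss in $M, number of defendants).
# None marks a missing value. All numbers are invented for illustration.
records = [
    (1998, 12.0, 2),
    (1999, 15.5, 3),
    (2000, 40.0, 4),
    (2001, 950.0, 9),
    (2002, 60.0, 5),
    (2003, None, 4),
    (2004, 35.0, 3),
    (2005, 22.0, 2),
]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Data-quality check: count rows with a missing loss amount.
missing = sum(1 for _, loss, _ in records if loss is None)

# Univariate, non-graphical summary computed on complete rows only.
complete = [row for row in records if row[1] is not None]
losses = [loss for _, loss, _ in complete]
summary = {
    "mean": statistics.mean(losses),
    "median": statistics.median(losses),
    "stdev": statistics.stdev(losses),
}
outliers = iqr_outliers(losses)

# Multivariate check: correlation between loss and number of defendants.
defendants = [n for _, _, n in complete]
corr = pearson(losses, defendants)

print("missing loss values:", missing)
print("summary:", summary)
print("outliers:", outliers)
print("loss/defendants correlation:", round(corr, 3))
```

In this toy data the mean loss sits far above the median because of a single extreme year, which is the kind of skew that would typically motivate a transformation (for example, a log scale) or a robust feature before fitting a tree-based classifier. The sketch uses only the standard library; an analysis at realistic scale would more commonly rely on a dataframe library.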