Reducing Fraud with Anomaly Detection Algorithms
Abstract
Financial statement fraud remains a persistent challenge despite extensive audit work and advances in AI-driven data processing. This study examines cases where engagements labeled as “low risk” nevertheless result in significant misstatements or irregularities. Using the public Big 4 Financial Risk & Compliance dataset (2020–2025), an audit failure is defined as a case in which the ratio of detected fraud cases to high-risk cases falls below 0.5. Two tree-based classifiers, Random Forest and XGBoost, are trained on firm- and engagement-level features (total audit engagements, high-risk case count, revenue impact, employee workload, audit effectiveness score, and AI usage). SHAP (SHapley Additive exPlanations) analysis ranks total revenue impact, employee workload, and audit effectiveness score as the strongest global risk drivers, while individual waterfall plots show how heavy workloads and lower effectiveness scores drive specific cases into the “failure” category, even when AI tools are deployed. Evaluated across five MD5-derived seeds, both models achieve over 95% recall, demonstrating robust sensitivity in detecting audit failures. These results identify actionable audit levers, such as workload management and effectiveness improvements, to reduce undetected fraud, providing transparent, data-driven guidance for smarter audit practices and informing future regulatory standards.
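The labeling rule and evaluation protocol described above can be illustrated with a minimal sketch using only the Python standard library. It is not the study's code. The function and parameter names (`is_audit_failure`, `md5_seeds`, `fraud_cases_detected`, `high_risk_cases`), the handling of a zero high-risk count, and the way seeds are derived from an MD5 digest are all assumptions made for illustration; only the 0.5 threshold, the use of five MD5-derived seeds, and recall as the metric come from the abstract.

```python
import hashlib

FAILURE_THRESHOLD = 0.5  # threshold stated in the abstract


def is_audit_failure(fraud_cases_detected, high_risk_cases):
    """Return True when detected / high-risk falls below the threshold."""
    if high_risk_cases == 0:
        # Assumption: the ratio is undefined here, so the record is
        # treated as "not a failure". The study may handle this differently.
        return False
    return fraud_cases_detected / high_risk_cases < FAILURE_THRESHOLD


def md5_seeds(base, n=5):
    """Derive n reproducible 32-bit seeds from MD5 digests.

    Hypothetical scheme: hash the string "<base>-<i>" and keep the
    first 8 hex digits of the digest as an integer.
    """
    seeds = []
    for i in range(n):
        digest = hashlib.md5("{}-{}".format(base, i).encode("utf-8")).hexdigest()
        seeds.append(int(digest[:8], 16))
    return seeds


def recall(y_true, y_pred):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

In a full pipeline, each seed would presumably set the random state of a train/test split and of the Random Forest or XGBoost model, with recall on the "failure" class reported per seed. Recall is the natural metric here because a false negative corresponds to an audit failure that goes unflagged.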