Financial Statement Fraud Detection Through an Integrated Machine Learning and Explainable AI Framework

Tsolmon Sodnomdavaa
Gunjargal Lkhagvadorj

Abstract

Financial statement fraud remains a substantial risk in environments marked by weak regulatory oversight and information asymmetry. This study develops a decision-centric framework that integrates machine learning, explainable artificial intelligence, and decision curve analysis to improve fraud detection under severe class imbalance. Using 969 firm-year observations from 132 Mongolian firms (2013–2024), we evaluate 21 financial ratios with models including Random Forest, XGBoost, LightGBM, MLP, TabNet, and a Stacking Ensemble trained with SMOTE and class-weighted learning. Performance was assessed using PR-AUC, F1-score, Recall, and DeLong-based significance testing. The Stacking Ensemble achieved the strongest results (PR-AUC = 0.93; F1 = 0.83), outperforming both classical and modern baseline models. Interpretability analyses (SHAP, LIME, and counterfactual explanations) consistently identified leverage, profitability, and liquidity indicators as dominant drivers of fraud risk, supported by a SHAP Stability Index of 0.87. Decision curve analysis showed that calibrated thresholds improved decision efficiency by 7–9% and reduced over-audit costs by 3–4%, while an audit cost simulation estimated annual savings of 80–100 million MNT. Overall, the proposed ML–XAI–DCA framework offers a transparent, interpretable, and cost-efficient approach for enhancing fraud detection in emerging-market contexts with limited textual disclosures.
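The decision curve analysis mentioned in the abstract rests on the standard net-benefit quantity, which weights false positives (unnecessary audits) by the odds of the chosen probability threshold. The sketch below is a minimal pure-Python illustration of that formula, not the authors' implementation. The function names `net_benefit` and `net_benefit_audit_all` are hypothetical, and the labels and scores are invented toy values, not data from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of auditing every firm whose predicted fraud
    probability is at least `threshold` (standard DCA definition)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are penalised by the odds of the threshold.
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def net_benefit_audit_all(y_true, threshold):
    """Reference strategy: audit every firm regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Toy labels (1 = fraud) and model scores, invented for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.90, 0.20, 0.40, 0.70, 0.10, 0.60, 0.05, 0.30, 0.15, 0.25]

for pt in (0.1, 0.2, 0.3, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_audit_all(y_true, pt)
    print(f"threshold={pt:.1f}  model={model_nb:.3f}  audit_all={all_nb:.3f}")
```

Plotting the model's net benefit against the audit-all and audit-none (net benefit of zero) strategies across thresholds gives the decision curve. A threshold is useful for audit planning where the model's curve lies above both reference strategies.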

Version published to 10.3390/jrfm19010013
Dec 24, 2025
Version published to 10.20944/preprints202510.1857.v1
Oct 24, 2025

Related articles

Comparative Performance of Deep Learning Models for Financial Statement Fraud Detection in an Imbalanced Classification Setting

This article has 2 authors:
1. Tsolmon Sodnomdavaa
2. Lkhamdulam Ganbat
This article has no evaluations
Latest version Jan 7, 2026

Integrating Model Explainability and Uncertainty Quantification for Trustworthy Fraud Detection

This article has 2 authors:
1. Tebogo Mapaila
2. Makhamisa Senekane
This article has no evaluations
Latest version Jan 7, 2026

Mining Financial Data for Fraud Detection using Ensemble Learning and Outlier Detection

This article has 2 authors:
1. Manimegalai R
2. Vijayalaskhmi P
This article has no evaluations
Latest version Dec 10, 2025
