Federated Deep Learning and Explainable AI for Real-Time Credit Card Fraud Detection in Highly Imbalanced Transaction Streams
Abstract
Credit card fraud detection systems today must contend with rapidly evolving fraud tactics, strict data-protection regulations, and the volume and velocity of streaming transaction records. This work extends existing ensemble learning systems (Random Forest + SMOTE) with Federated Deep Learning (FDL) and Explainable AI (XAI) for real-time fraud detection. The proposed framework combines a hybrid BiLSTM-CNN model, which captures both temporal and spatial patterns in transactions, with federated aggregation that preserves data locality. Because extreme class imbalance is a central challenge in this setting, oversampling with a Generative Adversarial Network (GAN) is proposed to learn the minority fraud distribution better than conventional SMOTE. In addition, explainability layers based on SHAP and LIME provide interpretable risk scores for each transaction, which supports trust and compliance with financial regulations. Experiments were carried out on multi-institutional credit card data (2.3M+ transactions) using a simulated testbed of heterogeneous banks. The results indicate a 4.8 increase in recall over the original BigBird counterparts and a 3.2 reduction in false positives relative to baseline Random Forest models, with real-time prediction latency under 300 ms. The SHAP-based explanations highlighted important temporal-spending deviations that align with researchers' knowledge of fraud cases. These findings indicate that the proposed system balances accuracy, low latency, and interpretability, and can therefore be applied to distributed financial systems.
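The abstract does not state which aggregation rule the framework uses. As a minimal sketch of how federated aggregation can keep transaction data local, the example below assumes sample-size-weighted averaging in the style of FedAvg. Each bank shares only its model parameters and its local sample count, never raw transactions. The function name `federated_average`, the parameter names, and the two example banks are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Parameter name -> flattened parameter values (hypothetical representation).
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count.

    Assumption: FedAvg-style weighting; the paper may use a different rule.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client update")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            # Weighted sum over clients for the i-th value of this parameter.
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes))
            / total
            for i in range(length)
        ]
    return averaged


# Hypothetical example: two banks with different local dataset sizes.
bank_a = {"dense.weight": [1.0, 2.0], "dense.bias": [0.0]}
bank_b = {"dense.weight": [3.0, 4.0], "dense.bias": [1.0]}
global_weights = federated_average([bank_a, bank_b], client_sizes=[100, 300])
print(global_weights)
```

Weighting by sample count lets a bank with more local transactions pull the global model further toward its update, which is one common way to handle heterogeneous institutions. Equal weighting or a robust aggregator would be drop-in alternatives in the same place.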