Confidence-Aware Pseudo-Labeling via Unsupervised Ensemble Consensus for Fraud Detection Contribution

Daniel Agyekum Amakye
Joseph Dadzie
Nana Yaw Duodu
Albert Mainu Tawiah

Abstract

Detecting fraud in financial transactions is difficult because labeled data are scarce, which makes traditional supervised methods hard to apply. To overcome this problem, this study introduces a hybrid approach that requires no seed labels: pseudo-labels are generated by the unsupervised ensemble consensus of four anomaly detection models, namely One-Class Support Vector Machine (SVM), Isolation Forest, DBSCAN, and an Autoencoder. Transactions are labeled by majority voting, and the agreement scores provide a confidence-aware hierarchy for prioritizing investigation. The pseudo-labels generated by these anomaly detection models are then used to train a supervised stacked ensemble composed of XGBoost, Random Forest, SVM, 1D CNN, and LSTM, with Logistic Regression as the meta-learner. On a dataset of 2,512 bank transactions, the unsupervised ensemble identified 181 anomalies (7.2%), and the stacked ensemble achieved 98.7% accuracy, 98.3% precision, 99.1% recall, and an F1 score of 98.7%. These findings illustrate that fraud detection can be achieved reliably without seed labels, using interpretable pseudo-labels to bridge the gap between unsupervised anomaly detection and supervised learning.
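
The consensus step described in the abstract can be illustrated with a minimal sketch. It assumes each of the four detectors has already produced a binary flag per transaction (1 = anomalous). The abstract does not say how a 2-of-4 tie is resolved or how agreement scores map to priority levels, so the `min_votes` threshold, the tier names, and the `consensus_label` helper are hypothetical choices made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict

# The four unsupervised detectors named in the abstract.
DETECTORS = ("one_class_svm", "isolation_forest", "dbscan", "autoencoder")


@dataclass
class PseudoLabel:
    label: int        # 1 = pseudo-labeled anomaly, 0 = pseudo-labeled normal
    agreement: float  # fraction of detectors that flagged the transaction
    tier: str         # hypothetical investigation priority derived from agreement


def consensus_label(flags: Dict[str, int], min_votes: int = 3) -> PseudoLabel:
    """Combine per-detector binary flags into one pseudo-label.

    Assumption: a strict majority (3 of 4) is needed to mark an anomaly,
    so a 2-2 tie is treated as normal. The source does not specify this.
    """
    votes = sum(flags[name] for name in DETECTORS)
    agreement = votes / len(DETECTORS)
    label = 1 if votes >= min_votes else 0

    # Hypothetical confidence tiers for prioritizing investigation.
    if votes == len(DETECTORS):
        tier = "high"
    elif votes >= min_votes:
        tier = "medium"
    elif votes > 0:
        tier = "low"
    else:
        tier = "none"
    return PseudoLabel(label=label, agreement=agreement, tier=tier)


# Usage: three of four detectors flag the transaction.
example = {"one_class_svm": 1, "isolation_forest": 1, "dbscan": 0, "autoencoder": 1}
print(consensus_label(example))
```

The resulting `label` values would serve as training targets for the supervised stacked ensemble, while `agreement` lets investigators review unanimous flags first.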

Version published to 10.21203/rs.3.rs-9313690/v1 on Research Square
Apr 16, 2026

Related articles

High-Performance Phishing Email Detection Using Hybrid Machine Learning and Deep Learning Approaches

This article has 3 authors:
1. Mohamed Khayati
2. Driss Ait Omar
3. Mohamed Baslam
This article has no evaluations
Latest version Apr 7, 2026

A Multiple Instance Learning framework with Instance Identification and Supervised Contrastive Learning for WSI Classification

This article has 4 authors:
1. Liming Yuan
2. Guangcan Hu
3. Na Qin
4. Lu Zhao
This article has no evaluations
Latest version Apr 1, 2026

AI-Powered Fraud Detection in Financial Networks: A Systematic Literature Review.

This article has 5 authors:
1. Srinivas Pochincharla
2. Jenfier Lawson
3. Farhat Kabir
4. David Wilson
5. Muhammad Sameer
This article has no evaluations
Latest version Mar 31, 2026
