Enhancing Subscription Fraud Detection Through Ensemble Learning: The Case of Ethio Telecom


Abstract

Telecommunication companies worldwide face the critical challenge of subscription fraud, which threatens both financial stability and national security. This research addresses the issue by developing a fraud detection model specifically for Ethio Telecom. The model uses ensemble and adaptive learning techniques, combining multiple classifiers to improve detection accuracy. The study used a dataset of 1,000,000 Call Detail Records (CDRs) collected over two months known for increased fraudulent activity. After filtering out irrelevant data and aggregating multiple call records per subscriber, the dataset was refined to 349,164 records. Initially, 16 features were analyzed, with four excluded for lacking relevance. The remaining 11 features, excluding the target variable, underwent preprocessing that included data cleaning, transformation, and balancing. Feature selection, using a correlation matrix and Random Forest importance analysis, led to the removal of four additional features, resulting in a final set of 8 key features, including INT_DIALLED, RATIO_INT_TOTAL, and RATIO_UNIQUE_TOTAL. Three individual models, namely Decision Tree (DT), Logistic Regression (LR), and Artificial Neural Network (ANN), were implemented alongside the ensemble methods Bagging, Boosting, Stacking, and Voting, and the adaptive models Hoeffding Tree and Adaptive Random Forest (ARF). The findings recommend Stacking and ARF as robust tools for subscription fraud detection.
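The aggregation step described in the abstract, where multiple call records are collapsed into one row per subscriber, might look roughly like the sketch below. It is only an illustration under stated assumptions: the record keys (`caller`, `callee`, `is_international`), the helper column `TOTAL_CALLS`, and the definitions given to INT_DIALLED, RATIO_INT_TOTAL, and RATIO_UNIQUE_TOTAL are guesses based on the feature names, not the definitions used in the study.

```python
from collections import defaultdict


def aggregate_cdrs(cdrs):
    """Collapse per-call records into one feature row per subscriber.

    Each CDR is a dict with hypothetical keys (assumed for this sketch):
      'caller'           subscriber identifier
      'callee'           dialled number
      'is_international' True if the call went abroad
    """
    total = defaultdict(int)          # calls made per subscriber
    international = defaultdict(int)  # international calls per subscriber
    unique = defaultdict(set)         # distinct dialled numbers per subscriber

    for rec in cdrs:
        sub = rec["caller"]
        total[sub] += 1
        international[sub] += int(rec["is_international"])
        unique[sub].add(rec["callee"])

    rows = {}
    for sub, n in total.items():
        rows[sub] = {
            "TOTAL_CALLS": n,  # hypothetical helper column, not from the study
            # Assumed meaning: number of international calls dialled.
            "INT_DIALLED": international[sub],
            # Assumed meaning: share of calls that were international.
            "RATIO_INT_TOTAL": international[sub] / n,
            # Assumed meaning: distinct dialled numbers per call made.
            "RATIO_UNIQUE_TOTAL": len(unique[sub]) / n,
        }
    return rows


# Tiny made-up sample: subscriber A makes four calls, B makes one.
cdrs = [
    {"caller": "A", "callee": "X", "is_international": True},
    {"caller": "A", "callee": "X", "is_international": True},
    {"caller": "A", "callee": "Y", "is_international": False},
    {"caller": "A", "callee": "Z", "is_international": True},
    {"caller": "B", "callee": "X", "is_international": False},
]
features = aggregate_cdrs(cdrs)
```

Rows produced this way would then feed the preprocessing, feature selection, and classifier stages named in the abstract. The sample data above is invented purely to exercise the function.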
