A Comparative Analysis of Machine Learning Models for URL-Based Phishing Detection

Rafi MRM
Nuski F.A.M
Suhaif A.M
Shaminda K.A.S

Abstract

Phishing attacks remain a significant cybersecurity threat, and traditional detection methods often fall short against evolving attacker techniques, which makes accurate, automatic detection of malicious URLs difficult. This research evaluates machine learning approaches applied to URL analysis. The study uses a dataset of labeled phishing and legitimate URLs, each described by 30 features spanning lexical, host-based, and content-related attributes. Five machine learning models were trained and compared: Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), XGBoost (XGB), and a Stacking Classifier ensemble. The XGBoost classifier achieved the highest accuracy, correctly classifying approximately 97.4% of URLs in the test set. The results show that machine learning, particularly XGBoost, can detect phishing URLs with high accuracy when given a comprehensive feature set. The study also contributes a functional prototype system that demonstrates the approach.
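
As an illustration of what "lexical" URL attributes can look like, the following is a minimal Python sketch that derives a handful of such features from a URL string. The function name `lexical_features` and the six features it returns are assumptions chosen for illustration; the abstract does not enumerate the 30 features used in the study, and this is not the authors' feature extractor.

```python
import ipaddress
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Return a few illustrative lexical features for one URL.

    Hypothetical feature set: not the 30 features used in the study.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""

    # A raw IP address in place of a domain name is a commonly cited signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "uses_https": parts.scheme == "https",
        "host_is_ip": host_is_ip,
        # Naive count: dots in the host minus one; it ignores multi-part
        # public suffixes such as "co.uk".
        "num_subdomains": 0 if host_is_ip else max(host.count(".") - 1, 0),
        "has_at_symbol": "@" in url,
        "num_hyphens_in_host": host.count("-"),
    }
```

A dictionary like this, computed per URL, could be turned into one row of a feature matrix and passed to any of the classifiers named above.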

Version published to 10.21203/rs.3.rs-6439154/v1 on Research Square
Apr 15, 2025

Related articles

Advanced Techniques in Phishing Detection Machine Learning Approaches and Their Effectiveness

This article has 1 author:
1. Betty Heleen
This article has no evaluations
Latest version May 6, 2025

Comprehensive Evaluation of Machine Learning Algorithms for Intrusion Detection: A Focus on Binary Logistic Regression

This article has 2 authors:
1. Owen Graham
2. Max Mckenzie
This article has no evaluations
Latest version May 5, 2025

Emergent Threat Discovery: Unsupervised Machine Learning for Phishing Campaign Analysis

This article has 2 authors:
1. Muhammad Fahad Zia
2. Sri Harish Kalidass
This article has no evaluations
Latest version May 23, 2025
