Machine-Learning Classification Model and Tools for Real-time URL Phishing Detection

Ramzi Saifan
Hani Ahmad
Talal A. Edwan

Abstract

Phishing attacks are a significant cybersecurity concern: they use deceptive tactics to lure individuals into interacting with counterfeit websites. These malicious pages are carefully crafted replicas of legitimate platforms, designed to collect sensitive data such as usernames, passwords, banking credentials, and other personal details. This study focuses on phishing via Uniform Resource Locators (URLs) and investigates how well machine learning can identify such deceptive websites from their behavior and URL attributes. To that end, the work introduces and demonstrates two tools: one for dataset creation and one for URL classification.

Machine learning has already proven effective at identifying phishing attacks from URLs, but obstacles remain, notably the need for large quantities of high-quality training data and the need to keep pace with attackers' constantly changing tactics. Integrating the proposed tools into a web browser plugin is intended to enable real-time URL analysis in the browser, strengthening protection against phishing attacks and thereby improving the user experience.

Using a self-collected dataset of 46,000 URLs, several machine learning algorithms were trained and tested, including support vector machine (SVM), XGBoost, decision tree, and random forest. Among these, the XGBoost model achieved a classification accuracy of 96%, an F1-score of 96.7%, a recall of 96.6%, and a precision of 96.9% after hyperparameter combinations were evaluated with grid search. These results underscore the potential of machine learning techniques to bolster cyber defenses and mitigate the impact of phishing attacks.
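As a rough illustration of what classifying a URL "based on its attributes" and scoring the result can look like, the sketch below extracts a handful of lexical features from a URL and computes the four metrics reported in the abstract from predicted labels. The abstract does not list the features the authors use, so the feature set here is an assumption drawn from commonly cited phishing indicators, and the names `url_features` and `binary_metrics` are hypothetical rather than part of the authors' tools.

```python
from urllib.parse import urlparse
import ipaddress


def url_features(url: str) -> dict:
    """Extract a few lexical features from a URL.

    Assumption: this feature set is illustrative only; the abstract
    does not specify which URL attributes the authors use.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        ipaddress.ip_address(host)  # raises ValueError for non-IP hosts
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",
        "num_digits": sum(ch.isdigit() for ch in url),
    }


def binary_metrics(y_true, y_pred) -> dict:
    """Compute accuracy, precision, recall and F1 (1 = phishing, 0 = legitimate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a pipeline like the one the abstract describes, feature dictionaries of this kind would be converted to numeric vectors and passed to a classifier, with the metric function applied to its predictions on a held-out test set.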

Version published to 10.21203/rs.3.rs-7666636/v1 on Research Square
Nov 11, 2025

RealPhish: An Algorithm for Real-Time Email Phishing Detection

This article has 4 authors:
1. Devendra Chapagain
2. Naresh Kshetri
3. Bishnu Bhusal
4. Pradip Subedi
This article has no evaluations
Latest version Nov 5, 2025

A Novel GRU-Attention Framework with Adaptive Authentication for Robust Phishing Attack Detection and Secure Data Transfer

This article has 5 authors:
1. Qaisar Abbas
2. Mubarak Albathan
3. Imran Qureshi
4. Mutlaq B. Aldajani
5. Amjad Ali Naz
This article has no evaluations
Latest version Nov 10, 2025

A Comparative Analysis of Deep Learning and Machine Learning Approaches for Spam Identification on Telegram

This article has 7 authors:
1. Shuo Xu
2. Zhanyi Ding
3. Zijing Wei
4. Chao Yang
5. Yixiang Li
6. Xuanjie Chen
7. Hailiang Wang
This article has no evaluations
Latest version Oct 28, 2025
