Machine-Learning Classification Model and Tools for Real-time URL Phishing Detection

Abstract

Phishing attacks are a significant cybersecurity concern, employing deceptive tactics to entice individuals into engaging with counterfeit websites. These malicious pages are carefully designed replicas of legitimate platforms that aim to collect sensitive data such as usernames, passwords, banking credentials, and other personal details. This study focuses on phishing via Uniform Resource Locators (URLs) and investigates the potential of machine learning to identify such deceptive websites based on their behavior and URL attributes. To accomplish this, the work introduces and demonstrates two key tools: one for dataset creation and the other for URL classification.

Machine learning has already shown its effectiveness in identifying phishing attacks from URLs, though some obstacles remain, such as the need for large quantities of high-quality training data and the need to keep up with the constantly changing tactics employed by phishing attackers. Integrating the proposed tools into a web browser plugin is intended to enable real-time URL analysis within the browser, enhancing the system's effectiveness against phishing attacks and hence improving user experience.

Using a self-collected dataset of 46,000 URLs, several machine learning algorithms were trained and tested, including support vector machine (SVM), XGBoost, decision tree, and random forest. Among these, the XGBoost model achieved a classification accuracy of 96%, an F1-score of 96.7%, a recall of 96.6%, and a precision of 96.9% after various combinations of hyperparameter values were assessed using grid search. This result underscores the potential of machine learning techniques to bolster cyber defenses and mitigate the impact of phishing attacks.
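The abstract does not list the URL attributes the classifier relies on, so the following is only a minimal sketch of what turning a URL into numeric features might look like. The feature names, the `SUSPICIOUS_TOKENS` list, and the `extract_url_features` helper are all assumptions made for illustration, not the authors' feature set or tooling.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical keyword list, chosen for illustration only.
SUSPICIOUS_TOKENS = ("login", "verify", "secure", "account", "update")


def _is_ip(host: str) -> bool:
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Map a URL to simple lexical features (an assumed, illustrative set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        "host_is_ip": int(_is_ip(host)),
        "num_suspicious_tokens": sum(tok in lowered for tok in SUSPICIOUS_TOKENS),
    }


if __name__ == "__main__":
    for example in ("https://example.com/", "http://192.168.0.1/login"):
        print(example, extract_url_features(example))
```

Feature dictionaries of this kind could be stacked into a table and passed to any of the classifiers named in the abstract. Purely lexical features are cheap enough to compute inside a browser plugin, which is one plausible reason to favor them for real-time use.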
