Leveraging Predominant Lexical Features to Enhance Malicious URL Detection for Cybersecurity Sustainability

Nor Hasliza Abdul Hamid
Sharifah Md Ya

Abstract

The Government of Malaysia aims to provide high-quality services to its citizens. Delivering information through websites is one channel that lets citizens reassess their perception of the Government's reliability. However, attackers create look-alike websites to exploit weaknesses in a webpage. They attempt to deceive victims into visiting the imitation site in order to obtain the victims' information or take control of their computers. According to Google's Transparency Report, 2.195 million websites were classified as "Sites Deemed Dangerous by Safe Browsing" on January 17, 2021, with 2.1 million of them being phishing sites. This study aims to improve the accuracy of malicious URL detection with a machine learning model by optimizing the selection and extraction of lexical features through RFI, SFM, and N-Gram techniques. It also seeks to develop a model that copes better with imbalanced datasets by raising the quality of the data, in order to detect malicious URLs more effectively. The study proposes an enhanced feature detection process for malicious URL detection that focuses on improving both the accuracy and the speed of detection based on lexical features. Accuracy, precision, recall, and F-score are used as performance evaluation metrics to compare the findings. In conclusion, this study found that combining lexical features with a machine learning model produced promising results in detecting harmful URLs and in distinguishing effectively between benign and dangerous URLs.
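As a rough illustration of what lexical features, character n-grams, and the named evaluation metrics can look like in practice, here is a minimal Python sketch. It is not the authors' pipeline: the feature names, the function names (`lexical_features`, `char_ngrams`, `precision_recall_f1`), and the example URL are all hypothetical choices made for this example, and the RFI and SFM feature-selection steps mentioned in the abstract are not implemented here.

```python
from collections import Counter
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few simple lexical features from the URL string alone.

    The feature set is an assumption for illustration, not the paper's.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
    }


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive (malicious) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical URL used only to demonstrate the feature extractor.
    print(lexical_features("http://example.com/login"))
    print(char_ngrams("example.com", n=3).most_common(3))
    print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

Per-class precision, recall, and F-score matter here because on an imbalanced dataset a classifier can reach high accuracy while still missing most of the malicious minority class.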

Version published to 10.21203/rs.3.rs-7395330/v1 on Research Square
Sep 3, 2025

Related articles

Phishing Attack Detection and Secure Data Transfer Using Echo State Networks and Federated Identity Management

This article has 2 authors:
1. Khalil El Hindi
2. Mohammed A. El-Meligy
This article has no evaluations
Latest version Dec 29, 2025

A Binary Genetic Harris Hawks Optimization With Machine Learning on Detection of Phishing Url

This article has 2 authors:
1. Ponni Ponnusamy
2. Priyadharsini Ganesan
This article has no evaluations
Latest version Dec 12, 2025

Systematic Detection of Layering Instances for Real-Time Anomaly Detection of Financial Crimes

This article has 3 authors:
1. Muhammad Nuraddeen Ado
2. Jabir Isah Karofi
3. Hamisu Mukhtar
This article has no evaluations
Latest version Jan 22, 2026
