Machine Learning Approaches in Software Vulnerability Detection: A Systematic Review and Analysis of Contemporary Methods

Jude E. Ameh
Abayomi Otebolaku
Alex Shenfield
Augustine Ikpehai
Dauda Sule

Abstract

This systematic review examines the application of machine learning (ML) techniques in software vulnerability detection, focusing on their effectiveness in identifying, classifying, and mitigating security risks within source code. The review synthesizes findings from 83 studies published between 2019 and 2024, encompassing static, dynamic, and hybrid detection methods. Key objectives include categorizing ML applications for specific vulnerability detection tasks, evaluating ML methodologies, compiling relevant datasets and tools, and identifying challenges and opportunities in the field. Findings reveal a predominant reliance on static analysis techniques, supported by advanced ML models such as Graph Neural Networks (GNNs) and transformer-based Natural Language Processing (NLP) models like CodeBERT. While deep learning approaches dominate due to their ability to process large-scale and complex patterns, hybrid and traditional ML methods remain significant for contexts requiring interpretability and smaller datasets. Analysis of datasets highlights a focus on the C/C++ programming family, with substantial challenges in dataset diversity, scalability, and class imbalances. Opportunities for improvement include the integration of multilingual datasets, hybrid static-dynamic methods, and advanced architectures to enhance detection accuracy and reduce computational overhead. The review identifies the need for explainable AI, real-world validation, and user-friendly tools to bridge the gap between academic research and industrial application. By addressing these challenges, future advancements in ML-based vulnerability detection can contribute to the development of scalable, interpretable, and effective solutions for modern software security.

Version published to 10.21203/rs.3.rs-5975490/v1 on Research Square
Mar 21, 2025

Related articles

GPTVD: Vulnerability Detection and Analysis Method Based on LLM's Chain of Thoughts.

This article has 5 authors:
1. Yinan Chen
2. Yuan Huang
3. Xiangping Chen
4. Pengfei Shen
5. Lei Yun
This article has no evaluations. Latest version: Apr 21, 2025

Artificial Intelligence in Cybersecurity: A Comprehensive Analysis of Machine Learning Applications on Phishing URLs and Malware API Calls for Devising Cyber Defense Strategies

This article has 2 authors:
1. Bekim Fetaji
2. Debabrata Samanta
This article has no evaluations. Latest version: Mar 10, 2025

MalGTA: Large language Model-based Guided Malware Tactical Analysis

This article has 5 authors:
1. Wenjie Guo
2. Jingfeng Xue
3. Zeyang Liu
4. Weijie Han
5. Jingjing Hu
This article has no evaluations. Latest version: Mar 31, 2025
