Machine Learning Approaches in Software Vulnerability Detection: A Systematic Review and Analysis of Contemporary Methods
Abstract
This systematic review examines the application of machine learning (ML) techniques in software vulnerability detection, focusing on their effectiveness in identifying, classifying, and mitigating security risks within source code. The review synthesizes findings from 83 studies published between 2019 and 2024, encompassing static, dynamic, and hybrid detection methods. Key objectives include categorizing ML applications for specific vulnerability detection tasks, evaluating ML methodologies, compiling relevant datasets and tools, and identifying challenges and opportunities in the field. Findings reveal a predominant reliance on static analysis techniques, supported by advanced ML models such as Graph Neural Networks (GNNs) and transformer-based Natural Language Processing (NLP) models like CodeBERT. While deep learning approaches dominate due to their ability to process large-scale and complex patterns, hybrid and traditional ML methods remain significant for contexts requiring interpretability and smaller datasets. Analysis of datasets highlights a focus on the C/C++ programming family, with substantial challenges in dataset diversity, scalability, and class imbalance. Opportunities for improvement include the integration of multilingual datasets, hybrid static-dynamic methods, and advanced architectures to enhance detection accuracy and reduce computational overhead. The review identifies the need for explainable AI, real-world validation, and user-friendly tools to bridge the gap between academic research and industrial application. By addressing these challenges, future advancements in ML-based vulnerability detection can contribute to the development of scalable, interpretable, and effective solutions for modern software security.