VulD-SG: Enhancing code vulnerability detection via combining deep sequence and graph model



Abstract

The growth and widespread use of open-source software have allowed software vulnerabilities to spread widely, posing serious challenges to system security. Recently, a number of deep-learning-based vulnerability detection methods have been proposed to help engineers analyze and patch vulnerabilities efficiently. However, existing approaches remain limited in their ability to extract rich features from vulnerable code. To address this problem, we propose VulD-SG, a dual-channel source code vulnerability detection method that combines a deep sequence model with a graph model. VulD-SG improves the extraction of semantic, syntactic, and structural features from source code by introducing sequence-based and graph-based vulnerability feature extraction modules. To address the coarse detection granularity of traditional methods, VulD-SG splits code statements into subtokens with a new decomposition algorithm in order to capture detailed vulnerability information. In addition, the graph-based vulnerability feature extraction module uses a Transformer-style encoder to aggregate program dependency graph (PDG) nodes, so that long-range dependencies in cross-function code are learned effectively. Finally, we build a fusion model that merges the training parameters and produces fine-grained predictions. Experimental results show that, compared with seven deep-learning-based vulnerability detection models on five vulnerability datasets, the accuracy, F1, and recall metrics improve by 2.6% to 27%, 2% to 29.2%, and 1% to 30.25%, respectively.
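
To make the idea of statement-level subtoken splitting concrete, the following is a minimal sketch. The abstract does not describe VulD-SG's decomposition algorithm, so this sketch swaps in a common heuristic: it splits identifiers on underscores and camelCase boundaries and lowercases the pieces. The names `split_identifier` and `subtokenize` are hypothetical and chosen only for illustration.

```python
import re

# Assumption: a generic heuristic, NOT the decomposition algorithm of VulD-SG.
# Matches an acronym run, a (capitalized) lowercase word, or a digit run.
_CAMEL = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")
# Lexes a statement into identifiers, integer runs, and single symbols.
_TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def split_identifier(name):
    """Split one identifier on underscores and camelCase boundaries."""
    parts = []
    for chunk in name.split("_"):
        parts.extend(_CAMEL.findall(chunk))
    # Fall back to the raw name when nothing matched (e.g. a lone "_").
    return [p.lower() for p in parts] or [name]


def subtokenize(statement):
    """Turn one code statement into a flat list of subtokens."""
    out = []
    for tok in _TOKEN.findall(statement):
        if tok[0].isalpha() or tok[0] == "_":
            out.extend(split_identifier(tok))  # identifier or keyword
        else:
            out.append(tok)  # number or punctuation, kept as-is
    return out


print(subtokenize("strcpy(destBuf, src_ptr[idx2]);"))
```

Splitting `destBuf` into `dest` and `buf` lets a sequence model share vocabulary across identifiers that would otherwise each be rare tokens. Lowercasing is a design choice of this sketch and discards case information that some models may prefer to keep.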
