Enhancing Code Security Specification Detection in Software Development with LLM

Abstract

To improve the accuracy and efficiency of code security specification detection during software development, this work builds a detection architecture based on a large language model (LLM) that performs deep semantic parsing and structured rule matching on highly complex code in multilingual environments. The research designs a controlled language for security specifications that integrates CFG and AST representations, covering 62 rule items, 480 types of semantic labels, and 3,000 AST nodes, and introduces a quintuple language model G = (N, Σ, P, S, C) to define five key constraint domains, including variable naming, privilege boundaries, and input validation. A graph neural network constructs a structural semantic graph with 2.1×10⁶ edges, and a deep detection model with 690 million parameters supports sliding-window scanning and semantic embedding matching, adapting to 45 kinds of security constraint structures. In experiments on 1.38 million function fragments, detection accuracy exceeds 93% overall and reaches 95.6% for the out-of-bounds memory access class, indicating that the model is adaptable and stable across function privilege control, input validation, and memory safety.
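The abstract does not spell out how sliding-window scanning over AST nodes works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes Python's standard `ast` module as the parser and substitutes two hand-written predicates for the paper's 690-million-parameter model and semantic embedding matching. The `Constraint` class, the rule names `naming` and `unsafe-eval`, and the window size are all hypothetical.

```python
import ast
import re
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass(frozen=True)
class Constraint:
    """Hypothetical rule: a label plus a predicate over a window of AST nodes."""
    name: str
    violates: Callable[[List[ast.AST]], bool]


def windows(nodes: List[ast.AST], size: int) -> Iterator[List[ast.AST]]:
    """Yield every contiguous window of `size` nodes (one short window if fewer)."""
    for start in range(max(len(nodes) - size + 1, 1)):
        yield nodes[start:start + size]


SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")


def bad_variable_name(window: List[ast.AST]) -> bool:
    # Variable-naming domain: flag assigned names that are not snake_case.
    return any(
        isinstance(n, ast.Name)
        and isinstance(n.ctx, ast.Store)
        and not SNAKE_CASE.match(n.id)
        for n in window
    )


def calls_eval(window: List[ast.AST]) -> bool:
    # Input-validation domain: flag direct calls to the built-in eval().
    return any(
        isinstance(n, ast.Call)
        and isinstance(n.func, ast.Name)
        and n.func.id == "eval"
        for n in window
    )


RULES = [
    Constraint("naming", bad_variable_name),
    Constraint("unsafe-eval", calls_eval),
]


def scan(source: str, rules: List[Constraint] = RULES, size: int = 8) -> List[str]:
    """Return the sorted names of all rules violated in any window of the AST."""
    nodes = list(ast.walk(ast.parse(source)))  # breadth-first node sequence
    found = set()
    for window in windows(nodes, size):
        for rule in rules:
            if rule.violates(window):
                found.add(rule.name)
    return sorted(found)
```

Calling `scan("userInput = input()\nresult = eval(userInput)\n")` reports both hypothetical rules, while a snippet such as `total = 1 + 2` reports none. In a learned detector, each window would presumably be embedded and compared against rule representations rather than checked by fixed predicates.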
