Research on the Vulnerability Identification Efficiency of Enhanced Reverse-Analyzed LLM Model in Binary Program Fuzz Testing

Abstract

To address the low coverage, largely blind input generation, and missed detection of complex vulnerabilities in traditional fuzz testing of binary programs, this paper proposes an enhanced fuzz testing framework that combines reverse analysis with a large language model (LLM). The framework extracts control flow graphs and key path features through static reverse analysis and captures input dependency trajectories through dynamic reverse analysis, then transforms both into structured text input for the LLM. The LLM, fine-tuned on binary vulnerability data, predicts high-risk areas and guides the fuzz testing module to generate targeted test cases. Experiments on the LAVA-M benchmark set and 10 real-world closed-source binary programs show that the framework improves the vulnerability discovery rate by 34.2% compared with AFL and shortens the average vulnerability discovery time by 28.7%. Ablation experiments verify that reverse analysis features improve the LLM's semantic understanding (the F1 score increases by 19.5%). The results indicate that combining reverse analysis with an LLM can effectively improve the vulnerability identification efficiency of binary program fuzz testing.
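
The abstract does not specify how reverse analysis output is turned into structured text for the LLM. As a rough illustration only, the sketch below serializes a small control flow graph into line-oriented text. The `BasicBlock` schema, the `cfg_to_text` function, the output layout, and the sample addresses and instructions are all hypothetical and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    """One node of a control flow graph (hypothetical schema)."""
    addr: int
    instructions: list[str]
    successors: list[int] = field(default_factory=list)


def cfg_to_text(function_name: str, blocks: list[BasicBlock]) -> str:
    """Serialize a CFG into plain text, one block header per basic block."""
    lines = [f"FUNCTION {function_name}"]
    # Emit blocks in address order so the text is deterministic.
    for block in sorted(blocks, key=lambda b: b.addr):
        succ = ", ".join(hex(s) for s in block.successors) or "none"
        lines.append(f"BLOCK {hex(block.addr)} -> {succ}")
        for ins in block.instructions:
            lines.append(f"  {ins}")
    return "\n".join(lines)


# Illustrative input: a length check guarding a memcpy call (made-up data).
blocks = [
    BasicBlock(0x401000, ["cmp eax, 0x10", "jg 0x401020"], [0x401010, 0x401020]),
    BasicBlock(0x401010, ["call memcpy"], [0x401020]),
    BasicBlock(0x401020, ["ret"]),
]
text = cfg_to_text("parse_header", blocks)
print(text)
```

A flat, line-oriented layout of this kind keeps block boundaries and edges explicit in the prompt; the actual encoding used by the framework may differ.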
