Research on the Vulnerability Identification Efficiency of Enhanced Reverse-Analyzed LLM Model in Binary Program Fuzz Testing

Shiyin Lin


Abstract

Traditional fuzz testing of binary programs suffers from low coverage, largely undirected input generation, and missed detection of complex vulnerabilities. To address these problems, this paper proposes an enhanced fuzz testing framework that combines reverse analysis with a large language model (LLM). The framework extracts control flow graphs and key path features through static reverse analysis and captures input dependency trajectories through dynamic reverse analysis, then transforms both into structured text input for the LLM. The LLM, fine-tuned on binary vulnerability data, predicts high-risk areas and guides the fuzz testing module to generate targeted test cases. Experiments on the LAVA-M benchmark set and 10 real-world closed-source binary programs show that the framework improves the vulnerability discovery rate by 34.2% compared to AFL and shortens the average vulnerability discovery time by 28.7%. Ablation experiments confirm that the reverse analysis features improve the LLM's semantic understanding (F1 score increases by 19.5%). The results indicate that combining reverse analysis with an LLM can effectively improve the vulnerability identification efficiency of binary program fuzz testing.
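The pipeline described in the abstract (reverse-analysis features serialized to structured text, a risk prediction step, then targeted test generation) can be illustrated with a minimal sketch. This is not the paper's implementation: the `BasicBlock` schema, the text format, and the function names are all hypothetical, and `stub_risk_score` is a simple heuristic standing in for the fine-tuned LLM, whose prompt format and outputs the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BasicBlock:
    """One CFG node from static reverse analysis (hypothetical schema)."""
    addr: int
    successors: List[int]
    # Library functions called inside the block (static feature).
    calls: List[str] = field(default_factory=list)
    # Input byte offsets observed to influence the block (dynamic feature).
    input_offsets: List[int] = field(default_factory=list)


def serialize_block(block: BasicBlock) -> str:
    """Render one block as a line of structured text for a model prompt."""
    succ = ",".join(hex(s) for s in block.successors) or "none"
    calls = ",".join(block.calls) or "none"
    offs = ",".join(str(o) for o in block.input_offsets) or "none"
    return f"block={hex(block.addr)} succ={succ} calls={calls} input_bytes={offs}"


def serialize_cfg(blocks: List[BasicBlock]) -> str:
    """Serialize the whole CFG, one block per line, ordered by address."""
    return "\n".join(serialize_block(b) for b in sorted(blocks, key=lambda b: b.addr))


# Assumption: a fixed list of risky calls replaces the LLM's learned judgment.
RISKY_CALLS = {"strcpy", "memcpy", "sprintf", "gets"}


def stub_risk_score(block: BasicBlock) -> float:
    """Placeholder for the LLM: risky calls weighted by input dependence."""
    risky = sum(1 for c in block.calls if c in RISKY_CALLS)
    return float(risky * (1 + len(block.input_offsets)))


def prioritized_offsets(blocks: List[BasicBlock], top_k: int = 1) -> List[int]:
    """Return input offsets feeding the top_k highest-risk blocks.

    A fuzzer could concentrate its mutations on these offsets.
    """
    ranked = sorted(blocks, key=stub_risk_score, reverse=True)[:top_k]
    return sorted({o for b in ranked for o in b.input_offsets})


# Toy three-block CFG: only the middle block copies input-dependent data.
EXAMPLE_BLOCKS = [
    BasicBlock(0x401000, [0x401020, 0x401040]),
    BasicBlock(0x401020, [0x401040], calls=["strcpy"], input_offsets=[4, 5, 6, 7]),
    BasicBlock(0x401040, [], calls=["puts"]),
]
```

On the toy graph, `prioritized_offsets(EXAMPLE_BLOCKS)` selects bytes 4 to 7, the ones that reach the `strcpy` block. In a real system the heuristic would be replaced by a model call that consumes the output of `serialize_cfg`.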

Version published to 10.20944/preprints202510.1106.v1
Oct 14, 2025

Related articles

Hybrid Fuzzing with LLM-Guided Input Mutation and Semantic Feedback
Author: Shiyin Lin
No evaluations. Latest version: Sep 23, 2025

LLM-Driven Adaptive Source–Sink Identification and False Positive Mitigation for Static Analysis
Author: Shiyin Lin
No evaluations. Latest version: Sep 10, 2025

Automatic Source Code Vulnerability Detection, Classification, and Prioritization Using Deep Learning
Authors: Melese Awoke, Yitayal Tehone
No evaluations. Latest version: Sep 9, 2025
