Robustness of Security-Oriented LLMs under Prompt Noise and Perturbation Attacks
Abstract
Prompt-level noise—such as typos, reordering, extraneous tokens, or misleading context—can destabilize LLM performance in security analysis. We present a robustness evaluation across 5 perturbation types and 9 noise levels, using 6,500 vulnerability-analysis tasks drawn from code and configuration files. Detection accuracy drops by 18–27% under moderate syntactic noise and by 32% under semantic distractions. To mitigate this weakness, we propose a noise-adaptive prompt transformation layer that automatically cleans, normalizes, and compresses queries before forwarding them to the LLM. The layer reduces noise-induced degradation by up to 63%, significantly improving operational reliability for real-world secure-coding workflows.
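The clean/normalize/compress pipeline described above can be sketched in a few lines. This is a minimal illustration only, not the paper's implementation: the function name, the specific regular expressions, and the line-deduplication heuristic (as a stand-in for query compression) are all assumptions.

```python
import re

def transform_prompt(prompt: str) -> str:
    """Hypothetical noise-adaptive transformation applied before a query
    is forwarded to the LLM. All heuristics here are illustrative."""
    # Clean: drop control characters and other non-printable debris,
    # keeping newlines and tabs intact
    cleaned = re.sub(r"[^\x20-\x7E\n\t]", "", prompt)
    # Normalize: unify line endings and collapse runs of spaces/tabs
    normalized = re.sub(r"[ \t]+", " ", cleaned.replace("\r\n", "\n")).strip()
    # Compress: drop consecutive duplicate lines, a simple proxy for
    # removing redundant or injected filler context
    out: list[str] = []
    for line in normalized.split("\n"):
        if not out or line != out[-1]:
            out.append(line)
    return "\n".join(out)
```

In practice such a layer would likely combine learned noise classifiers with these rule-based passes; the sketch shows only the deterministic preprocessing stage.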