Addressing the Deployment Gap: Hybrid Symbolic-Statistical Vulnerability Detection in Safety-Critical C/C++ Systems
Abstract
Machine learning for vulnerability detection presents a persistent paradox: while academic benchmarks report 95–97% accuracy, production deployment remains below 15% due to fundamental failures under real-world distribution shifts and asymmetric error costs. This research identifies three primary drivers of this gap: class imbalance collapse, multiclass instability across sparse categories, and brittleness under adversarial code obfuscation. To address these, we present a hybrid symbolic-statistical architecture that uses a staged decision system to route inputs through deterministic template matching, a statistical machine learning fallback, and explicit safety-rule overrides. The system was evaluated in a large-scale empirical study on the C3-VULMAP corpus, which consists of 6.3 million C/C++ functions. Quantitative results demonstrate that the hybrid system achieves 99.11% binary accuracy with an overall false negative rate of 0.89% and an average latency of 264 ms, making it suitable for CI/CD integration. Critically, the architecture maintains robustness under adversarial conditions, yielding an 8% reduction in false negatives on obfuscated code compared to pure machine learning baselines. A formative practitioner study with seven security engineers using their own production codebases found an 85% preference for the system’s pattern-based explanations over black-box confidence scores, citing increased trustworthiness for regulatory audit requirements. By trading marginal benchmark optimization for production reliability and mechanistic interpretability, this hybrid approach provides a viable pathway for deploying automated code analysis in safety-critical domains.
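The staged routing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names the three stages but not their internals, so the patterns in `TEMPLATES` and `SAFETY_RULES`, the `toy_model` scorer, the 0.5 threshold, and the `Verdict` type are all hypothetical stand-ins. The sketch also assumes that safety rules may only escalate a verdict to "vulnerable" and never suppress one, which is one plausible reading of "override" under asymmetric error costs.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    vulnerable: bool
    stage: str   # which stage produced the decision
    reason: str  # human-readable explanation for auditors


# Hypothetical deterministic templates (illustrative only).
TEMPLATES = [
    (re.compile(r"\bgets\s*\("), "call to gets(): unbounded read"),
    (re.compile(r"\bstrcpy\s*\("), "call to strcpy(): no bounds check"),
]

# Hypothetical safety rules that escalate regardless of the model score.
SAFETY_RULES = [
    (re.compile(r"\bsystem\s*\("), "call to system(): flagged by safety rule"),
]


def toy_model(source: str) -> float:
    """Stand-in for a trained classifier; returns a pseudo-probability."""
    risky_calls = re.findall(r"\b(?:memcpy|sprintf|alloca)\s*\(", source)
    return min(1.0, 0.4 * len(risky_calls))


def classify(
    source: str,
    model: Callable[[str], float] = toy_model,
    threshold: float = 0.5,
) -> Verdict:
    # Stage 1: deterministic template matching gives an explainable hit.
    for pattern, reason in TEMPLATES:
        if pattern.search(source):
            return Verdict(True, "template", reason)

    # Stage 2: statistical fallback when no template matches.
    score = model(source)
    if score >= threshold:
        return Verdict(True, "model", f"model score {score:.2f} at or above threshold")

    # Stage 3: safety rules can escalate a "safe" verdict, never the reverse.
    for pattern, reason in SAFETY_RULES:
        if pattern.search(source):
            return Verdict(True, "safety_rule", reason)

    return Verdict(False, "model", f"model score {score:.2f} below threshold")
```

Every verdict carries the stage and the reason that produced it, which is the kind of pattern-based explanation the practitioner study contrasts with a bare confidence score. For example, `classify("void f(char *d, const char *s) { strcpy(d, s); }")` is decided at the `template` stage without consulting the model at all.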