Toward Reproducible Cyberattack Reconstruction: A Semi-Automated Framework Leveraging Standardized CTI Reports
Abstract
Cyberattacks are becoming more sophisticated, critically endangering infrastructure and data integrity. Within this context, attack reconstruction emerges as a pivotal methodology for understanding attack chains and enhancing defensive capabilities. Conventional reconstruction approaches, however, remain constrained by limited reproducibility and excessive reliance on expert knowledge, hindering systematic and precise reproduction of attack processes. Furthermore, Cyber Threat Intelligence (CTI) reports predominantly exist as unstructured narratives plagued by inconsistent terminology and semantic ambiguities, obstructing automated analysis and attack reconstruction.

This paper presents a semi-automated framework for cyberattack reconstruction that systematically transforms unstructured CTI reports into executable adversarial workflows. The framework first normalizes heterogeneous CTI reports into a standardized attack language, which encodes Tactics, Techniques, and Procedures (TTPs) to address semantic inconsistencies and facilitate interoperability across organizations. Using these structured representations, the framework employs scenario retrieval to identify similar attack sequences through feature matching. The retrieved attack processes are then decomposed into atomic techniques aligned with the MITRE ATT&CK framework, each with an explicit semantic definition. Finally, the system dynamically orchestrates execution environments by inferring configurations from parsed attack parameters, automating infrastructure provisioning to ensure reproducibility and reduce manual setup effort.

We evaluate the framework on three real-world attack campaigns and the DARPA Transparent Computing Red Team scenario, analyzing the adversarial techniques involved and revealing multiple implementation variants across attack methodologies. For instance, certain techniques exhibit up to 70 distinct implementations. The proposed framework reduces manual analysis workload by approximately 40% to 60% while maintaining a high degree of fidelity and reproducibility. To further assess reconstruction effectiveness, we use the Cobalt attack case as an example: of 77 lines of attack description, only 8 required modification, so manual involvement accounts for roughly 10%.
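The scenario retrieval step can be illustrated with a minimal sketch. The abstract does not specify the features or the similarity measure, so this example assumes each standardized scenario is reduced to a set of MITRE ATT&CK technique IDs and uses Jaccard similarity as a stand-in for feature matching. The `Scenario` class, the `retrieve` function, and the scenario library are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """A standardized attack scenario: a name plus its ATT&CK technique IDs."""
    name: str
    techniques: frozenset[str]


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard similarity of two technique sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: frozenset[str], library: list[Scenario],
             top_k: int = 3) -> list[tuple[str, float]]:
    """Rank stored scenarios by similarity to the query technique set."""
    scored = [(s.name, jaccard(query, s.techniques)) for s in library]
    # Highest similarity first; ties keep their library order (stable sort).
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical scenario library (assumption: not taken from the paper).
library = [
    Scenario("phishing_to_c2",
             frozenset({"T1566.001", "T1204.002", "T1059.001", "T1071.001"})),
    Scenario("credential_theft",
             frozenset({"T1003.001", "T1078", "T1021.001"})),
    Scenario("web_exploit",
             frozenset({"T1190", "T1505.003", "T1059.004"})),
]

# Techniques extracted from a new, already standardized CTI report.
query = frozenset({"T1566.001", "T1204.002", "T1059.001"})
ranked = retrieve(query, library)

for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Set-based matching ignores the order of techniques within an attack chain. A sequence-aware measure may fit the framework better if ordering matters, but that choice is beyond what the abstract states.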