A Study on Improving the Automatic Classification Performance of Cybersecurity MITRE ATT&CK Tactics Using NLP-Based ModernBERT and BERTopic Models

Jaehwan Baek
Jeonghoon O
Seungwoo Jeong
Wooju Kim


Abstract

Cyber Threat Intelligence (CTI) reports are essential resources for identifying the Tactics, Techniques, and Procedures (TTPs) of hackers and cyber threat actors. However, these reports are often lengthy and unstructured, which limits their suitability for automatic mapping to the MITRE ATT&CK framework. This study designs and compares five hybrid classification models that combine statistical features (TF-IDF), transformer-based contextual embeddings (BERT and ModernBERT), and topic-level representations (BERTopic) to automatically classify CTI reports into 12 ATT&CK tactic categories. Experiments using the rcATT dataset, consisting of 1490 public threat reports, show that the model integrating TF-IDF and ModernBERT achieved a micro-precision of 72.25%, reflecting a 10.07-percentage-point improvement in detection precision compared with the baseline. The model combining TF-IDF and BERTopic achieved a micro F0.5 of 67.14% and a macro F0.5 of 63.20%, demonstrating balanced performance across both frequent and rare tactic classes. These findings indicate that integrating statistical, contextual, and semantic representations can improve the balance between precision and recall while enabling clearer interpretation of model outputs in multi-label CTI classification. Furthermore, the proposed model shows potential applicability for improving detection efficiency and reducing analyst workload in Security Operations Center (SOC) environments.
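The micro and macro F0.5 scores reported above weight precision more heavily than recall, and they differ in how they aggregate across the tactic labels. The following is a minimal sketch of how such scores can be computed for multi-label predictions, assuming labels are given as binary indicator rows (one column per tactic). The function names and the toy data are illustrative assumptions, not taken from the paper or the rcATT dataset.

```python
from typing import List, Sequence, Tuple


def fbeta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from raw counts; beta < 1 favours precision over recall."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Convention assumed here: the score is 0.0 when there is nothing to count.
    return (1 + b2) * tp / denom if denom else 0.0


def micro_macro_fbeta(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
    beta: float = 0.5,
) -> Tuple[float, float]:
    """Return (micro, macro) F-beta for binary multi-label indicator rows."""
    n_labels = len(y_true[0])
    tps: List[int] = [0] * n_labels
    fps: List[int] = [0] * n_labels
    fns: List[int] = [0] * n_labels

    # Accumulate per-label true positives, false positives, false negatives.
    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t and p:
                tps[j] += 1
            elif p:
                fps[j] += 1
            elif t:
                fns[j] += 1

    # Micro: pool the counts over all labels, then compute one score.
    micro = fbeta(sum(tps), sum(fps), sum(fns), beta)
    # Macro: score each label separately, then take the unweighted mean.
    macro = sum(fbeta(tp, fp, fn, beta) for tp, fp, fn in zip(tps, fps, fns)) / n_labels
    return micro, macro


# Toy example with three hypothetical tactic labels and four reports.
toy_true = [[1, 0, 1], [1, 1, 0], [0, 1, 0], [1, 0, 0]]
toy_pred = [[1, 0, 0], [1, 0, 0], [0, 1, 1], [1, 0, 0]]
toy_micro, toy_macro = micro_macro_fbeta(toy_true, toy_pred)
print(f"micro F0.5 = {toy_micro:.4f}, macro F0.5 = {toy_macro:.4f}")
```

Because the macro average gives every label equal weight, a rare tactic that the classifier misses entirely pulls the macro score down much more than the micro score. This is why reporting both gives a view of performance on frequent and rare classes.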

Version published to 10.3390/electronics14224434
Nov 13, 2025
Version published to 10.20944/preprints202510.1543.v1
Oct 20, 2025

