Machine Learning for Lateral Movement Detection using Sysmon Logs: An Empirical Comparison of Imbalanced and Resampled Data
Abstract
Lateral Movement (LM) is a growing threat, frequently employed by advanced persistent threat groups to escalate privileges and navigate systems toward high-value assets. To address limitations in the existing literature, this work uses LMD-2023, an imbalanced, LM-focused benchmark dataset of Microsoft Windows Sysmon logs, to contribute to the LM Intrusion Detection System (IDS) domain. We investigate the impact of various open-source oversampling and undersampling techniques on the performance of LM IDS frameworks. Specifically, we address the research question: How does the sample distribution within a benchmark dataset affect the performance evaluation metrics of LM-oriented IDS models, whether shallow or deep neural network (DNN)? To this end, we adopt a multiclass supervised approach, classifying network activity into Normal, Exploitation of Remote Services, and Exploitation of Hashing Techniques. We examine the effect of oversampling, undersampling, and hybrid-sampling techniques across 13 machine learning algorithms, nine shallow and four DNN, using the LMD-2023 corpus. Our key findings reveal that balanced versions of the dataset generally improved performance. Shallow models trained on resampled data achieved a marginal gain of approximately 0.05% in AUC and F1-score compared to the imbalanced scenario, whereas DNN models exhibited a more substantial gain of around 3.5% across most balancing techniques. Furthermore, analysis of False Positive Rate (FPR) and False Negative Rate (FNR) revealed crucial trade-offs: while some balanced datasets led to near-zero FNR with ensemble methods like Bagging, others, particularly with DNNs and techniques like ADASYN, showed a higher propensity for false alarms.
These observations underscore the critical role of data balancing in optimizing LM IDS performance and highlight the varying impact of different techniques on the FPR/FNR trade-off for shallow versus deep learning models.
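To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' pipeline, of (a) naive random oversampling and undersampling of a three-class label set and (b) one-vs-rest FPR and FNR per class. The function names, the short class labels, and the toy class counts are assumptions made for illustration only; they do not come from LMD-2023.

```python
import random
from collections import Counter, defaultdict


def random_resample(X, y, mode="over", seed=0):
    """Balance classes by random duplication ("over") or random dropping ("under")."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    sizes = [len(rows) for rows in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)
    X_out, y_out = [], []
    for label, rows in by_class.items():
        if mode == "over":
            # Keep every original row, then duplicate random rows up to the target.
            picked = rows + [rng.choice(rows) for _ in range(target - len(rows))]
        else:
            # Keep a random subset of the target size.
            picked = rng.sample(rows, target)
        X_out.extend(picked)
        y_out.extend([label] * len(picked))
    return X_out, y_out


def one_vs_rest_rates(y_true, y_pred, labels):
    """Per-class (FPR, FNR), with FPR = FP / (FP + TN) and FNR = FN / (FN + TP)."""
    rates = {}
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fpr = fp / (fp + tn) if fp + tn else 0.0
        fnr = fn / (fn + tp) if fn + tp else 0.0
        rates[c] = (fpr, fnr)
    return rates


# Hypothetical toy data: 8 normal events, 3 remote-service and 1 hashing exploit.
toy_X = [[i] for i in range(12)]
toy_y = ["normal"] * 8 + ["remote_services"] * 3 + ["hashing"] * 1

over_X, over_y = random_resample(toy_X, toy_y, mode="over")
under_X, under_y = random_resample(toy_X, toy_y, mode="under")
print(Counter(over_y), Counter(under_y))
```

In practice, resampling is applied to the training split only, so that the test set keeps the original class distribution. Techniques such as ADASYN differ from this sketch in that they synthesize new minority samples rather than duplicating existing rows.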