Performance Analysis of Machine Learning Models with Optimized Feature Selection Techniques for Darknet Traffic Classification
Abstract
Darknet traffic classification is a crucial area of cybersecurity, concerned with identifying network activity that is concealed within anonymized networks. This study evaluates the efficacy of three feature selection methods, Boruta, Recursive Feature Elimination (RFE), and Lasso, across eight machine learning models, including Random Forest, Gradient Boosting, Neural Networks, and Support Vector Machines (SVM). Rigorous data preprocessing, such as encoding, normalization, and downsampling, is applied to improve computational efficiency and model accuracy. Boruta emerged as the most effective method: paired with Random Forest, it achieved 98.7% accuracy and 99.66% ROC-AUC, showing its potential for high-accuracy applications. In contrast, Lasso worked best with simpler models such as Naive Bayes, improving its performance significantly, while RFE yielded the fastest training times, making it well suited to resource-constrained environments. This performance analysis underscores the critical role of appropriate feature selection in darknet traffic classification and sets the stage for future research into deeper, more complex machine learning frameworks. The findings illustrate significant trade-offs between computational efficiency and model performance, which can guide the choice of feature selection technique according to specific application needs.
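The abstract names three preprocessing steps (encoding, normalization, and downsampling) without detailing them. The following is a minimal sketch of what such steps could look like, using only the Python standard library. It is not the authors' pipeline: the function names, the toy records, the class labels, and the choices of integer label encoding, min-max scaling, and random undersampling to the smallest class are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def label_encode(values):
    """Map each distinct category to an integer, in sorted order."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def min_max_normalize(column):
    """Scale a numeric column to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def downsample(rows, labels, seed=0):
    """Randomly undersample every class to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label in sorted(by_class):
        for row in rng.sample(by_class[label], n):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy records (protocol, flow duration) with made-up labels;
# these values are illustrative and do not come from the study's dataset.
protocols = ["tcp", "udp", "tcp", "tcp", "udp", "tcp"]
durations = [10.0, 250.0, 40.0, 130.0, 70.0, 190.0]
labels = ["benign", "darknet", "benign", "benign", "darknet", "benign"]

encoded, mapping = label_encode(protocols)          # categorical -> integer
scaled = min_max_normalize(durations)               # numeric -> [0, 1]
rows = list(zip(encoded, scaled))
balanced_rows, balanced_labels = downsample(rows, labels)  # equal class sizes
```

In this sketch the balanced set keeps as many rows per class as the rarest class has, which shrinks the training data and so reduces training cost, at the price of discarding majority-class examples. A feature selection method such as Boruta, RFE, or Lasso would then be applied to the preprocessed features.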