A Study on Explainable Artificial Intelligence (XAI) in Malware Detection for Proactive Cyber Threat Hunting

Abstract

Effective malware detection increasingly requires machine learning models that are both accurate and interpretable, yet high-performing models often suffer from poor explainability. This study addresses that gap by integrating advanced Explainable Artificial Intelligence (XAI) frameworks with well-established ML algorithms to create transparent, trustworthy detection systems. We evaluate Decision Tree, Random Forest, XGBoost, Naïve Bayes, and Kernel SVM classifiers, using SHAP (SHapley Additive exPlanations) for feature-importance assessment. Models with limited interpretability (Kernel SVM, Decision Tree) are excluded, and the remaining models undergo ELI5 permutation-importance validation to confirm feature rankings and decision logic. XGBoost emerges as the optimal choice owing to its superior accuracy, its handling of complex non-linear relationships, and its stable, reproducible explanations. We then apply further XAI techniques, namely Partial Dependence Plots (PDP), Individual Conditional Expectation (ICE) plots, and 2D Accumulated Local Effects (2D-ALE), to reveal global and local trends, feature interactions, and non-linear patterns among the model's most influential features. The analysis demonstrates that tree-based ensemble models, when paired with rigorous XAI techniques, yield transparent and operationally trustworthy tools for cybersecurity. The result is a methodology that couples high predictive performance with actionable intelligence, enabling security practitioners to validate and deploy ML-based malware classifiers with confidence.
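To make the described workflow concrete, the sketch below illustrates the general shape of such a pipeline: train an XGBoost classifier, compute SHAP feature importances, cross-check the ranking with permutation importance, and inspect the top features with PDP/ICE plots. It is not the authors' exact pipeline: the synthetic dataset and hyperparameters are placeholders, scikit-learn's permutation_importance stands in for ELI5, and the 2D-ALE step is omitted because it requires a dedicated library (e.g., alibi or PyALE).

```python
# Illustrative sketch of the evaluate-then-explain workflow from the abstract.
# Synthetic data stands in for the real malware feature set.
import shap
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.inspection import permutation_importance, PartialDependenceDisplay

# Placeholder binary "malware vs. benign" dataset.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=42)

# Candidate model: XGBoost, the classifier the study ultimately selects.
model = xgb.XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")
model.fit(X_train, y_train)

# 1) SHAP feature-importance assessment (TreeExplainer handles tree ensembles).
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test, show=False)

# 2) Permutation-importance validation of the feature ranking
#    (scikit-learn's implementation used here in place of ELI5's).
perm = permutation_importance(model, X_test, y_test, n_repeats=10,
                              random_state=42)
top_features = perm.importances_mean.argsort()[::-1][:5]
print("Top permutation-ranked features:", top_features)

# 3) Global and local effect analysis: PDP with overlaid ICE curves
#    (kind="both") for the two most influential features.
PartialDependenceDisplay.from_estimator(model, X_test,
                                        features=top_features[:2].tolist(),
                                        kind="both")
```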
