Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients

Abstract

The increasing adoption of deep neural networks (DNNs) in critical domains such as healthcare, finance, and autonomous systems underscores the growing importance of explainable artificial intelligence (XAI). In these high-stakes applications, understanding the decision-making processes of models is essential for ensuring trust and safety. However, traditional DNNs often function as "black boxes," delivering accurate predictions without providing insight into the factors driving their outputs. Expected Gradients (EG) is a prominent method for providing such explanations by calculating the contribution of each input feature to the final decision. Despite its effectiveness, the conventional baselines used in state-of-the-art implementations of EG often lack a clear definition of what constitutes "missing" information. In this work, we propose DeepPrior-EG, a deep prior-guided EG framework that leverages prior knowledge to align more closely with the concept of missingness and enhance interpretive fidelity. It resolves the baseline misalignment by initiating gradient path integration from learned prior baselines derived from the deep features of CNN layers. This approach not only mitigates feature-absence artifacts but also amplifies critical feature contributions through adaptive gradient aggregation. We further introduce two probabilistic prior modeling strategies: a multivariate Gaussian model (MGM) that captures high-dimensional feature interdependencies, and a Bayesian nonparametric Gaussian mixture model (BGMM) that autonomously infers mixture complexity for heterogeneous feature distributions. We also develop an explanation-driven model retraining paradigm to validate the robustness of the proposed framework. Comprehensive evaluations across various qualitative and quantitative metrics demonstrate its superior interpretability, and the BGMM variant achieves state-of-the-art attribution quality and faithfulness compared with existing methods. DeepPrior-EG advances the interpretability of complex models within the XAI landscape and unlocks their potential in safety-critical applications.
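
To make the core mechanism concrete, the sketch below shows how Expected Gradients attributions can be computed when the integration path starts from learned baselines rather than a fixed all-zero input. This is an illustrative Python/PyTorch example under stated assumptions, not the authors' implementation: the function name expected_gradients is hypothetical, and the baselines tensor stands in for samples drawn from a prior (for example, a Gaussian fitted to deep CNN features) as described in the abstract.

import torch

def expected_gradients(model, x, baselines, n_samples=50):
    # Approximate Expected Gradients attributions for a single input x.
    # model: differentiable torch.nn.Module producing a scalar score per input
    # x: input tensor of shape (C, H, W)
    # baselines: tensor of shape (B, C, H, W); here a stand-in for baselines
    #            sampled from a learned prior over deep features (assumption)
    attributions = torch.zeros_like(x)
    for _ in range(n_samples):
        # Sample a baseline x' and an interpolation coefficient alpha ~ U(0, 1)
        x_prime = baselines[torch.randint(len(baselines), (1,)).item()]
        alpha = torch.rand(1).item()
        point = (x_prime + alpha * (x - x_prime)).requires_grad_(True)
        # Gradient of the model output at the interpolated point
        score = model(point.unsqueeze(0)).sum()
        grad, = torch.autograd.grad(score, point)
        # Accumulate the EG integrand (x - x') * gradient
        attributions += (x - x_prime) * grad
    return attributions / n_samples

# Example usage with a toy CNN; the randomly drawn baselines below are purely
# illustrative stand-ins for prior-derived baselines.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3),
    torch.nn.Flatten(),
    torch.nn.Linear(8 * 14 * 14, 1),
)
x = torch.randn(3, 16, 16)
baselines = torch.randn(32, 3, 16, 16)
attr = expected_gradients(model, x, baselines)

Averaging (x - x') * gradient over sampled baselines x' and interpolation coefficients alpha approximates the EG attribution for each feature; replacing randomly drawn baselines with prior-derived ones is the change the framework above motivates.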