Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients

Abstract

The increasing adoption of deep neural networks (DNNs) in critical domains such as healthcare, finance, and autonomous systems underscores the growing importance of explainable artificial intelligence (XAI). In these high-stakes applications, understanding the decision-making processes of models is essential for ensuring trust and safety. However, traditional DNNs often function as "black boxes," delivering accurate predictions without providing insight into the factors driving their outputs. Expected Gradients (EG) is a prominent method for providing such explanations by calculating the contribution of each input feature to the final decision. Despite its effectiveness, the conventional baselines used in state-of-the-art implementations of EG often lack a clear definition of what constitutes "missing" information. In this work, we propose DeepPrior-EG, a deep prior-guided EG framework that leverages prior knowledge to align more closely with the concept of missingness and enhance interpretive fidelity. It resolves the baseline misalignment by initiating gradient path integration from learned prior baselines derived from the deep features of CNN layers. This approach not only mitigates feature-absence artifacts but also amplifies critical feature contributions through adaptive gradient aggregation. We further introduce two probabilistic prior modeling strategies: a multivariate Gaussian model (MGM) that captures high-dimensional feature interdependencies, and a Bayesian nonparametric Gaussian mixture model (BGMM) that autonomously infers mixture complexity for heterogeneous feature distributions. We also develop an explanation-driven model retraining paradigm to validate the robustness of the proposed framework. Comprehensive evaluations across various qualitative and quantitative metrics demonstrate its superior interpretability, and the BGMM variant achieves state-of-the-art attribution quality and faithfulness compared with existing methods. DeepPrior-EG advances the interpretability of complex models within the XAI landscape and unlocks their potential in safety-critical applications.
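
To make the core mechanism concrete, the sketch below shows how Expected Gradients attributions can be computed when the integration path starts from learned baselines rather than a fixed all-zero input. This is an illustrative Python/PyTorch example under stated assumptions, not the authors' implementation: the function name expected_gradients is hypothetical, and the baselines tensor stands in for samples drawn from a prior (for example, a Gaussian fitted to deep CNN features) as described in the abstract.

import torch

def expected_gradients(model, x, baselines, n_samples=50):
    # Approximate Expected Gradients attributions for a single input x.
    # model: differentiable torch.nn.Module producing a scalar score per input
    # x: input tensor of shape (C, H, W)
    # baselines: tensor of shape (B, C, H, W); here a stand-in for baselines
    #            sampled from a learned prior over deep features (assumption)
    attributions = torch.zeros_like(x)
    for _ in range(n_samples):
        # Sample a baseline x' and an interpolation coefficient alpha ~ U(0, 1)
        x_prime = baselines[torch.randint(len(baselines), (1,)).item()]
        alpha = torch.rand(1).item()
        point = (x_prime + alpha * (x - x_prime)).requires_grad_(True)
        # Gradient of the model output at the interpolated point
        score = model(point.unsqueeze(0)).sum()
        grad, = torch.autograd.grad(score, point)
        # Accumulate the EG integrand (x - x') * gradient
        attributions += (x - x_prime) * grad
    return attributions / n_samples

# Example usage with a toy CNN; the randomly drawn baselines below are purely
# illustrative stand-ins for prior-derived baselines.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3),
    torch.nn.Flatten(),
    torch.nn.Linear(8 * 14 * 14, 1),
)
x = torch.randn(3, 16, 16)
baselines = torch.randn(32, 3, 16, 16)
attr = expected_gradients(model, x, baselines)

Averaging (x - x') * gradient over sampled baselines x' and interpolation coefficients alpha approximates the EG attribution for each feature; replacing randomly drawn baselines with prior-derived ones is the change the framework above motivates.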