GradLIME: A CNN Local Interpretation Model Based on Feature Gradient Activation

Abstract

As deep learning technologies advance rapidly, there is a growing demand for greater transparency and reliability in neural network decision-making. This demand has spurred progress in the explainability of Convolutional Neural Networks (CNNs) in recent years, though significant challenges persist. Current explanation methods typically fall into two categories: those that rely entirely on the internal feature information of neural networks to construct explanations, and model-agnostic approaches based on visual concepts. The first category is limited by the highly abstract nature of the features embedded within neural networks and their fundamental differences from human reasoning, leading to inevitable deviations from human cognition. Model-agnostic methods, on the other hand, can explore a CNN's computational logic from a human-centric perspective, but their independence from any specific model makes it difficult to produce explanations directly linked to the network's computational structure; in some cases, these explanations may even deviate from the model's true underlying mechanisms. To address these issues, this paper proposes GradLIME, a local explanation model for CNNs based on feature gradient activation, built upon the Local Interpretable Model-agnostic Explanations (LIME) method. When constructing the local linear explanation model, GradLIME incorporates feature gradient activation data from multiple layers of the CNN, yielding a comprehensible local linear explanation that also fully utilises the embedded features pertaining to the network's computational structure. Finally, experiments on standard datasets provide qualitative and quantitative evaluations of the local explanations generated by GradLIME. The results demonstrate that, compared with numerous state-of-the-art methods that provide visual explanations, GradLIME is more effective at distinguishing important from unimportant features and at extracting accurate local explanations of CNN reasoning that are easier for humans to understand.
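
The abstract describes the method only at a high level; the sketch below is one possible reading of it, assuming that per-superpixel scores derived from gradient-times-activation maps of several CNN layers are folded into a LIME-style weighted linear surrogate. The layer selection, locality kernel, and the way the scores scale the interpretable features are illustrative assumptions, not the authors' reference implementation.

```python
# Hypothetical GradLIME-style sketch: LIME surrogate informed by gradient-activation maps.
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.linear_model import Ridge
from skimage.segmentation import slic


def grad_activation_map(model, layers, image, target_class):
    """Average gradient*activation over the given layers, upsampled to input size."""
    acts = {}

    def make_hook(name):
        def hook(module, inputs, output):
            output.retain_grad()          # keep gradients of intermediate activations
            acts[name] = output
        return hook

    handles = [layer.register_forward_hook(make_hook(name))
               for name, layer in layers.items()]
    model(image.unsqueeze(0))[0, target_class].backward()
    for h in handles:
        h.remove()

    maps = []
    for a in acts.values():
        ga = (a.grad * a).relu().sum(dim=1, keepdim=True)           # (1, 1, h, w)
        maps.append(F.interpolate(ga, size=image.shape[1:], mode="bilinear"))
    return torch.cat(maps).mean(dim=0).squeeze(0).detach().numpy()  # (H, W)


def gradlime_explain(model, layers, image, target_class, n_samples=500):
    """LIME-style local surrogate whose interpretable features are scaled by
    per-superpixel gradient-activation scores (an assumed combination)."""
    # image: CPU float tensor of shape (C, H, W), values in [0, 1], no grad required.
    segments = slic(image.permute(1, 2, 0).numpy(), n_segments=50, start_label=0)
    n_seg = segments.max() + 1

    heat = grad_activation_map(model, layers, image, target_class)
    seg_score = np.array([heat[segments == s].mean() for s in range(n_seg)])
    seg_score = seg_score / (np.abs(seg_score).max() + 1e-8)

    masks = np.random.randint(0, 2, size=(n_samples, n_seg))
    masks[0] = 1                                     # include the unperturbed image
    preds = []
    with torch.no_grad():
        for m in masks:
            keep = torch.from_numpy(m[segments]).float()
            probs = torch.softmax(model((image * keep).unsqueeze(0)), dim=1)
            preds.append(probs[0, target_class].item())

    dist = np.linalg.norm(1 - masks, axis=1) / np.sqrt(n_seg)       # fraction removed
    sample_w = np.exp(-(dist ** 2) / 0.25)                          # locality kernel
    X = masks * seg_score                                           # gradient-activation scaling
    surrogate = Ridge(alpha=1.0).fit(X, preds, sample_weight=sample_w)
    return segments, surrogate.coef_                                # per-superpixel importance
```

For example, with a torchvision ResNet one might call `gradlime_explain(resnet, {"layer3": resnet.layer3, "layer4": resnet.layer4}, img, cls)`; the returned coefficients rank superpixels by their estimated contribution to the target class. Again, this is only an interpretation of the abstract, not the published algorithm.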
