Interpreting BERT Using LIME and SHAP

Abstract

Transformer-based language models such as BERT have achieved state-of-the-art performance on diverse natural language processing tasks, yet their decision processes remain opaque. This paper presents a comprehensive framework for interpreting BERT’s predictions in multi-label text classification using two leading model-agnostic explainability techniques—Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). An end-to-end pipeline for fine-tuning BERT and producing token-level attributions is introduced. We systematically compare the explainers with respect to local fidelity, global consistency, stability and computational cost. Experimental results suggest that LIME generates intuitive, case-specific explanations while SHAP provides theoretically grounded and globally consistent attributions. By integrating the complementary strengths of both methods, we propose a hybrid interpretation strategy that balances interpretability, scalability and accuracy. The methodology is illustrated through a case study on multi-label genre classification from movie plot summaries. Detailed guidelines and synthetic visualisations are provided to enable practitioners to apply these techniques effectively and responsibly.
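To make the described pipeline concrete, the sketch below shows how a fine-tuned multi-label BERT classifier might be wrapped for both LIME and SHAP using the Hugging Face transformers, lime, and shap libraries. The checkpoint path, genre label set, and example plot summary are illustrative placeholders, not artifacts from the paper, and the code is a minimal sketch under those assumptions rather than the authors' implementation.

```python
# Hypothetical sketch: token/word-level explanations for a fine-tuned
# multi-label BERT genre classifier using LIME and SHAP.
import numpy as np
import torch
import shap
from lime.lime_text import LimeTextExplainer
from transformers import AutoTokenizer, AutoModelForSequenceClassification

GENRES = ["action", "comedy", "drama", "horror"]        # assumed label set
CKPT = "path/to/finetuned-bert-genre-classifier"        # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(CKPT)
model = AutoModelForSequenceClassification.from_pretrained(
    CKPT, num_labels=len(GENRES), problem_type="multi_label_classification"
)
model.eval()

def predict_proba(texts):
    """Return per-label sigmoid probabilities, shape (n_texts, n_labels)."""
    enc = tokenizer(list(texts), padding=True, truncation=True,
                    return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.sigmoid(logits).numpy()

plot = "A retired detective is pulled back in for one last heist that goes wrong."

# LIME: perturb the input text and fit a local linear surrogate per label.
lime_explainer = LimeTextExplainer(class_names=GENRES)
lime_exp = lime_explainer.explain_instance(
    plot, predict_proba, labels=list(range(len(GENRES))), num_features=10
)
print(lime_exp.as_list(label=0))   # word weights for the first genre label

# SHAP: Shapley-value attributions over tokens via a text masker.
shap_explainer = shap.Explainer(predict_proba, shap.maskers.Text(tokenizer))
shap_values = shap_explainer([plot])
print(shap_values.values.shape)    # (1, n_tokens, n_labels)
```

In this pattern the same prediction function serves both explainers, which is what makes the paper's side-by-side comparison of local fidelity, stability, and runtime feasible: LIME's sampling budget and SHAP's masker choice can be varied independently of the underlying model.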