Explainable Hallucination Mitigation in Large Language Models: A Survey

Abstract

Hallucinations in large language models (LLMs) pose significant challenges to their reliability, especially in knowledge-intensive and reasoning-oriented tasks. While existing efforts have focused on detecting or correcting such errors, they often lack a unified interpretive framework for understanding the underlying causes and for designing principled mitigation strategies. This survey examines hallucination mitigation through the lens of explainability, organizing the literature into a taxonomy that distinguishes internal from post-hoc interpretability methods. We analyze how techniques such as attribution tracing, reasoning-path construction, and prompt-based verification enable both transparent diagnosis and structured intervention. We also reflect on the constructive role of hallucinations in creativity-driven and user-centered applications, arguing that context-aware control may be more appropriate than universal suppression. By consolidating recent research, this survey advocates integrating explainability into the development of more transparent, controllable, and trustworthy language generation systems.
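As a concrete illustration of the prompt-based verification idea named in the abstract, the sketch below shows one way a draft-then-self-check loop could be wired up. It is a minimal sketch, not the survey's method: the `generate` callable, all prompt strings, and the helper names (`draft_answer`, `plan_checks`, `verify_and_revise`) are hypothetical assumptions introduced only for illustration.

```python
# Minimal sketch of prompt-based verification for hallucination mitigation.
# `generate` stands in for any text-generation backend; a stub is used so the
# example runs as-is. All prompts and names are illustrative assumptions, not
# taken from the survey.
from typing import Callable, List


def draft_answer(generate: Callable[[str], str], question: str) -> str:
    """Produce an initial answer that may contain hallucinated claims."""
    return generate(f"Answer the question concisely:\n{question}")


def plan_checks(generate: Callable[[str], str], answer: str) -> List[str]:
    """Ask the model to enumerate factual claims in its own answer."""
    raw = generate(
        "List each factual claim in the following answer, one per line:\n"
        f"{answer}"
    )
    return [line.strip() for line in raw.splitlines() if line.strip()]


def verify_and_revise(generate: Callable[[str], str], question: str) -> str:
    """Draft, self-check each claim, then rewrite using only supported claims."""
    answer = draft_answer(generate, question)
    claims = plan_checks(generate, answer)
    verdicts = [
        generate(f"Is this claim supported? Answer 'yes' or 'no'.\nClaim: {c}")
        for c in claims
    ]
    supported = [c for c, v in zip(claims, verdicts) if v.lower().startswith("yes")]
    return generate(
        "Rewrite the answer using only these verified claims:\n"
        + "\n".join(supported)
        + f"\nQuestion: {question}"
    )


if __name__ == "__main__":
    # Stub backend: echoes the last prompt line so the sketch runs offline.
    echo = lambda prompt: prompt.splitlines()[-1]
    print(verify_and_revise(echo, "When was the transformer architecture introduced?"))
```

In practice `generate` would wrap an actual LLM call, and the verification prompts would be tuned to the task; the point of the sketch is only the structure of the loop, in which the model's intermediate self-checks double as an explanation of why a claim was kept or dropped.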
