Training-Free Counterfactual Hallucination Mitigation Method for Large Vision-Language Models
Abstract
Despite the remarkable progress enabled by large vision-language models (LVLMs) in visual question answering, these models remain vulnerable to object hallucination, a critical failure mode in which generated answers describe plausible but nonexistent visual elements. This phenomenon stems from LVLMs' overreliance on linguistic priors and training-set biases, and it undermines their trustworthiness in real-world applications. To tackle this challenge, we propose CounterfactualLVLM, a novel, training-free, plug-and-play framework that mitigates object hallucinations via small-model-assisted counterfactual reasoning. The core idea is to explicitly contrast the LVLM's behavior on the original image with its behavior when visually salient objects are deliberately removed. To achieve this, we design an unbiased small-scale VQA model that locates the objects most semantically influential for a given question. These key objects are masked to generate counterfactual samples that simulate targeted visual uncertainty. We then introduce a visual contrastive decoding mechanism that subtracts the hallucination-prone output distribution (obtained from the counterfactual input) from the original distribution, thereby suppressing hallucinated predictions without any fine-tuning or parameter modification. Extensive experiments across challenging benchmarks (POPE, VQA-CP, OK-VQA) and multiple LVLMs (LLaVA, Qwen-VL, InstructBLIP) show that CounterfactualLVLM significantly reduces object hallucinations and improves VQA accuracy, achieving consistent gains even under resource-constrained settings. Our work highlights counterfactual guidance as a simple yet effective paradigm for enhancing factual grounding in LVLM-based multi-modal reasoning.
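As a rough illustration of the decoding rule described above, the sketch below contrasts next-token logits obtained from the original image with those obtained from the counterfactual (masked) image. It is a minimal sketch under our own assumptions: the function name contrastive_decode_step, the scaling weight alpha, and the exact combination rule are illustrative and not taken from the paper.

import torch
import torch.nn.functional as F

def contrastive_decode_step(logits_original, logits_counterfactual, alpha=1.0):
    # Next-token logits from the same LVLM, conditioned on (a) the original image
    # and (b) the counterfactual image with the question-critical objects masked.
    # The counterfactual branch is dominated by linguistic priors, so scaling it by
    # alpha and subtracting it penalizes tokens the model would produce even
    # without supporting visual evidence.
    adjusted = (1.0 + alpha) * logits_original - alpha * logits_counterfactual
    return F.log_softmax(adjusted, dim=-1)

# Hypothetical usage with random logits: pick the next token from the contrasted distribution.
vocab_size = 32000
log_probs = contrastive_decode_step(torch.randn(vocab_size), torch.randn(vocab_size), alpha=1.0)
next_token = log_probs.argmax()

Because the adjustment operates only on output logits at decoding time, it requires no fine-tuning or parameter modification of the underlying LVLM, which is consistent with the training-free, plug-and-play claim above.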