Deception-Based Benchmarking: Measuring LLM Susceptibility to Induced Hallucination in Reasoning Tasks Using Misleading Prompts
Abstract
We present a novel benchmarking methodology for Large Language Models (LLMs) that evaluates their susceptibility to hallucination and thereby helps determine their reliability for real-world, higher-stakes applications. The method, called Deception-Based Benchmarking, tasks the model with composing a short paragraph: first under standard conditions, then under the constraint that the paragraph must begin with a misleading sentence. Based on these outputs, the model is scored on three criteria: accuracy, susceptibility, and consistency. The approach can be integrated with existing benchmarks or applied to new ones, facilitating a comprehensive evaluation of models across multiple dimensions, and it covers various forms of hallucination. We applied the methodology to several small open-source models using a modified version of MMLU, DB-MMLU. Our findings indicate that most current models are not specifically designed to self-correct when the random sampling process leads them to produce inaccuracies. However, certain models, such as Solar-10.7B-Instruct, exhibit reduced vulnerability to hallucination, as reflected in their susceptibility and consistency scores; these metrics are distinct from traditional benchmark scores. Our results align with TruthfulQA, a widely used hallucination benchmark. Looking forward, DB-benchmarking can readily be applied to other benchmarks to monitor the advancement of LLMs.
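As an illustration, the sketch below outlines one plausible per-item evaluation loop and aggregation of the three scores described above. It is a minimal sketch for exposition only: the `query_model` placeholder, the prompt templates, and the exact definitions of susceptibility and consistency are assumptions, not the paper's implementation.

```python
# Minimal sketch of a deception-based evaluation loop (assumed details, not the paper's code).
from dataclasses import dataclass


@dataclass
class DBResult:
    baseline_correct: bool  # correctness under standard conditions
    misled_correct: bool    # correctness when forced to open with a misleading sentence


def query_model(prompt: str) -> str:
    """Placeholder for a call to the model under test (e.g. a local open-source LLM)."""
    raise NotImplementedError


def evaluate_item(question: str, misleading_sentence: str, correct_answer: str) -> DBResult:
    # 1) Standard condition: the model composes a short paragraph and answers freely.
    baseline = query_model(
        f"Answer the question in a short paragraph.\nQuestion: {question}"
    )
    # 2) Deceptive condition: the paragraph must begin with a misleading sentence.
    misled = query_model(
        "Answer the question in a short paragraph that begins with the sentence below.\n"
        f"Opening sentence: {misleading_sentence}\nQuestion: {question}"
    )
    # Crude correctness check; a real harness would use answer extraction or a grader.
    return DBResult(
        baseline_correct=correct_answer.lower() in baseline.lower(),
        misled_correct=correct_answer.lower() in misled.lower(),
    )


def aggregate(results: list[DBResult]) -> dict[str, float]:
    n = len(results)
    accuracy = sum(r.baseline_correct for r in results) / n
    # Susceptibility (assumed definition): fraction of initially correct answers
    # that flip to incorrect once the model is misled.
    n_correct = sum(r.baseline_correct for r in results)
    flipped = sum(r.baseline_correct and not r.misled_correct for r in results)
    susceptibility = flipped / max(n_correct, 1)
    # Consistency (assumed definition): agreement in correctness across the two conditions.
    consistency = sum(r.baseline_correct == r.misled_correct for r in results) / n
    return {"accuracy": accuracy, "susceptibility": susceptibility, "consistency": consistency}
```

Under these assumptions, a low susceptibility score indicates that the model tends to recover the correct answer even after being forced into a misleading opening, which is the behavior the abstract attributes to models such as Solar-10.7B-Instruct.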