Quantifying Hallucination Bias in AI-Generated Deepfakes: A Multimodal Analysis Using Divergence Metrics
Abstract
The rapid development of artificial intelligence (AI) has transformed content creation while also introducing new challenges, particularly AI 'hallucinations': instances where models generate incorrect or fabricated outputs. This study hypothesizes that hallucinations, often resulting from model overfitting, can mimic or facilitate the generation of deepfakes. We propose a novel divergence metric, θ, to quantitatively differentiate hallucinated outputs from those produced by deepfake models. Leveraging the FaceForensics++ dataset and a dual-model training strategy based on autoencoders, we contrast the behavior of a regularized deepfake model with that of an overfitted, hallucination-prone model. Empirical evaluation using θ-distributions, classification metrics, and t-SNE visualization reveals measurable differences in output divergence. These findings offer insight into the ethical and technical implications of model hallucination and contribute toward more robust digital forensics and detection systems.
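The abstract does not formally define θ, so the following is only a minimal sketch of the kind of per-sample divergence the study describes. It assumes θ is a symmetric KL divergence between intensity histograms of an input face crop and a model's reconstruction, and it uses synthetic arrays as stand-ins for FaceForensics++ crops and for the outputs of the regularized and overfitted autoencoders; the function name `theta` and all parameters are illustrative, not the paper's actual formulation.

```python
import numpy as np

def theta(original, reconstruction, bins=64, eps=1e-8):
    """Hypothetical divergence theta: symmetric KL divergence between
    normalized pixel-intensity histograms of an input and its
    reconstruction. Illustrative only; the paper's theta may differ."""
    def hist(img):
        h, _ = np.histogram(img, bins=bins, range=(0.0, 1.0), density=True)
        h = h + eps              # avoid log(0) in the KL terms
        return h / h.sum()
    p, q = hist(original), hist(reconstruction)
    return 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Contrast theta-distributions of a regularized deepfake model against an
# overfitted, hallucination-prone model on the same evaluation faces.
rng = np.random.default_rng(0)
faces = rng.random((100, 64, 64))  # stand-in for FaceForensics++ face crops
recon_reg = np.clip(faces + rng.normal(0, 0.02, faces.shape), 0, 1)   # mild reconstruction error
recon_over = np.clip(faces + rng.normal(0, 0.10, faces.shape), 0, 1)  # larger, fabricated detail

theta_reg = np.array([theta(f, r) for f, r in zip(faces, recon_reg)])
theta_over = np.array([theta(f, r) for f, r in zip(faces, recon_over)])
print(f"mean theta (regularized model): {theta_reg.mean():.4f}")
print(f"mean theta (overfitted model):  {theta_over.mean():.4f}")
```

Under these assumptions, the two θ-distributions separate because the overfitted model's reconstructions drift further from the input statistics; the study's classification metrics and t-SNE visualization would then operate on such per-sample θ values rather than on the synthetic histograms used here.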