Quantifying Hallucination Bias in AI-Generated Deepfakes: A Multimodal Analysis Using Divergence Metrics
Abstract
The rapid development of artificial intelligence (AI) has transformed content creation while also introducing new challenges, particularly AI 'hallucinations': instances where models generate incorrect or fabricated outputs. This study hypothesizes that hallucinations, often resulting from model overfitting, can mimic or facilitate the generation of deepfakes. We propose a novel divergence metric, θ, to quantitatively differentiate hallucinated outputs from those produced by deepfake models. Leveraging the FaceForensics++ dataset and a dual-model training strategy based on autoencoders, we contrast the behavior of a regularized deepfake model with that of an overfitted, hallucination-prone model. Empirical evaluation using θ-distributions, classification metrics, and t-SNE visualization reveals measurable differences in output divergence. These findings offer insight into the ethical and technical implications of model hallucination and contribute toward more robust digital forensics and detection systems.
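The abstract does not formally define θ, so the following is only a minimal sketch of the kind of per-sample divergence the study describes. It assumes θ is a symmetric KL divergence between intensity histograms of an input face crop and a model's reconstruction, and it uses synthetic arrays as stand-ins for FaceForensics++ crops and for the outputs of the regularized and overfitted autoencoders; the function name `theta` and all parameters are illustrative, not the paper's actual formulation.

```python
import numpy as np

def theta(original, reconstruction, bins=64, eps=1e-8):
    """Hypothetical divergence theta: symmetric KL divergence between
    normalized pixel-intensity histograms of an input and its
    reconstruction. Illustrative only; the paper's theta may differ."""
    def hist(img):
        h, _ = np.histogram(img, bins=bins, range=(0.0, 1.0), density=True)
        h = h + eps              # avoid log(0) in the KL terms
        return h / h.sum()
    p, q = hist(original), hist(reconstruction)
    return 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Contrast theta-distributions of a regularized deepfake model against an
# overfitted, hallucination-prone model on the same evaluation faces.
rng = np.random.default_rng(0)
faces = rng.random((100, 64, 64))  # stand-in for FaceForensics++ face crops
recon_reg = np.clip(faces + rng.normal(0, 0.02, faces.shape), 0, 1)   # mild reconstruction error
recon_over = np.clip(faces + rng.normal(0, 0.10, faces.shape), 0, 1)  # larger, fabricated detail

theta_reg = np.array([theta(f, r) for f, r in zip(faces, recon_reg)])
theta_over = np.array([theta(f, r) for f, r in zip(faces, recon_over)])
print(f"mean theta (regularized model): {theta_reg.mean():.4f}")
print(f"mean theta (overfitted model):  {theta_over.mean():.4f}")
```

Under these assumptions, the two θ-distributions separate because the overfitted model's reconstructions drift further from the input statistics; the study's classification metrics and t-SNE visualization would then operate on such per-sample θ values rather than on the synthetic histograms used here.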