As good as chance: A systematic review and meta-analysis of human deepfake detection performance based on 56 papers
Abstract
Deepfakes are artificial intelligence-generated content designed to be perceived as real, typically with deceptive intent. Deepfakes threaten public and personal safety through misuse in disinformation, propaganda, and identity theft. Though research has been conducted on human performance in deepfake detection, the results have not yet been synthesized. This systematic review and meta-analysis investigates human deepfake detection. In June and October 2024, PubMed, ScienceGov, JSTOR, Google Scholar, and paper references were searched. Inclusion criteria were studies conducting novel empirical research measuring human detection performance using high-quality deepfakes. Pooling accuracy, odds ratio, and sensitivity index (d') effect sizes (k = 137 effects) from 56 papers (86,155 participants), we investigate (1) general deepfake detection performance, (2) performance for different stimulus modalities (audio, image, text, and video), and (3) the effects of strategies for improving performance. Deepfake detection performance is consistently near chance level across audio (62.08% [38.23, 83.18], k = 8), image (53.16% [42.12, 64.64], k = 18), text (52.00% [37.42, 65.88], k = 15), and video (57.31% [47.80, 66.57], k = 26) deepfakes (total: 55.54% [48.87, 62.10], k = 67). Odds ratios (OR = 0.64 [0.52, 0.79], k = 62) correspond to a 39% probability of detecting deepfakes, i.e., worse than chance. However, strategies such as feedback training, AI support, and deepfake caricaturization increase detection performance (65.14% [55.21, 74.46], k = 15), especially for video stimuli.
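The effect sizes reported above can be illustrated with a short sketch (the function names are ours, not from the paper): an odds ratio against chance converts to a detection probability via p = OR / (1 + OR), so the pooled OR of 0.64 implies roughly a 39% probability of correct detection; the sensitivity index d' is the standard signal-detection-theory measure, computed from hit and false-alarm rates.

```python
from statistics import NormalDist

def odds_ratio_to_probability(odds_ratio: float) -> float:
    """Convert an odds ratio into a probability: p = OR / (1 + OR).
    The pooled OR = 0.64 reported above yields about 0.39."""
    return odds_ratio / (1.0 + odds_ratio)

def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate),
    where z is the inverse standard-normal CDF."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

print(round(odds_ratio_to_probability(0.64), 2))  # ~0.39
print(round(d_prime(0.6, 0.4), 3))  # small positive d' for weak discrimination
```

A d' of 0 indicates no ability to discriminate deepfakes from real stimuli, consistent with the near-chance accuracies pooled in this review.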