Voice clones sound realistic but not (yet) hyperrealistic

Abstract

AI-generated voices are increasingly prevalent in our lives, via virtual assistants, automated customer service, and voice-overs. With the increased availability and affordability of AI-generated voices, we need to examine how humans perceive them. Recently, an intriguing effect was reported for AI-generated faces, where such face images were perceived as more human than images of real humans, a phenomenon termed the "hyperrealism effect." Here, we tested whether a hyperrealism effect also exists for AI-generated voices. We investigated the extent to which AI-generated voices sound real to human listeners, and whether listeners can accurately distinguish between human and AI-generated voices. We also examined perceived social trait characteristics (trustworthiness and dominance) of human and AI-generated voices. We tested these questions using AI-generated voices created with and without a specific human counterpart (i.e., voice clones and voices generated from the latent space of a large voice model). We find that voice clones can sound as real as human voices, making it difficult for listeners to distinguish between them. However, we did not observe a hyperrealism effect. Both types of AI-generated voices were evaluated as more dominant than human voices, with some AI-generated voices also being perceived as more trustworthy. These findings raise questions for future research: Can hyperrealistic voices be created with more advanced technology, or is the lack of a hyperrealism effect due to differences between voice and face (image) perception? Our findings also highlight the potential for AI-generated voices to be used for misinformation and fraud, alongside opportunities to use realistic AI-generated voices for beneficial purposes.
