Dynamic presentation in 3D modulates face similarity judgments – A human-aligned encoding model approach
Abstract
Face perception dynamically unfolds in three-dimensional space, yet experimental paradigms predominantly rely on static 2D images, limiting insights into real-world face processing. We conducted a pre-registered study comparing face similarity judgments in static 2D and dynamic 3D conditions using a triplet odd-one-out task in 2,605 participants (yielding data from 323,400 unique trials). Behavioral similarity matrices revealed a strong cross-condition correlation (R2D~3D = 0.93, p < 0.001), suggesting perceptual invariance, that is, consistency across modalities. However, human-aligned sparse (VICE) and deep (VGG-Face) encoding models trained to map face stimuli to behavioral judgments uncovered condition-specific weighting of facial geometry: while chin-cheek distance, eye size, and nose shape dominated similarity judgments in both conditions, facial width-to-height ratio and upper face length gained more perceptual relevance in 3D. Importantly, the richer information in 3D stimuli significantly reduced choice variance, indicating lower perceptual demand than in 2D during similarity judgments. Employing a representational alignment framework, our approach reveals both shared cognitive processing and representational differences between static 2D and dynamic 3D faces, motivating more naturalistic experimental paradigms that reflect real-world perception. Our open large-scale dataset and encoding models enable further advances in face perception research across biological and computational systems.
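To make the analysis pipeline concrete, the sketch below shows one common way a triplet odd-one-out task can be turned into a pairwise similarity matrix, and how two such matrices (e.g., 2D and 3D conditions) can be correlated. This is a minimal illustration under simplifying assumptions, not the authors' actual pipeline: function names are hypothetical, and similarity is estimated simply as the fraction of trials in which a pair "survived" as the most similar pair.

```python
import numpy as np

def similarity_from_triplets(triplets, n_faces):
    """Estimate a pairwise similarity matrix from odd-one-out choices.

    Each triplet is (i, j, k), where face k was chosen as the odd one
    out, implying that faces i and j were judged the most similar pair.
    Similarity for a pair is the fraction of trials containing that pair
    in which it was left together (a simplifying assumption for
    illustration; not the VICE model used in the study).
    """
    wins = np.zeros((n_faces, n_faces))
    appearances = np.zeros((n_faces, n_faces))
    for i, j, k in triplets:
        # Every pair in the triplet appeared in this trial.
        for a, b in ((i, j), (i, k), (j, k)):
            appearances[a, b] += 1
            appearances[b, a] += 1
        # The non-odd pair (i, j) "won" the trial.
        wins[i, j] += 1
        wins[j, i] += 1
    sim = np.where(appearances > 0, wins / np.maximum(appearances, 1), 0.0)
    np.fill_diagonal(sim, 1.0)  # a face is maximally similar to itself
    return sim

def matrix_correlation(sim_a, sim_b):
    """Pearson correlation of the upper triangles of two similarity matrices."""
    iu = np.triu_indices_from(sim_a, k=1)
    return np.corrcoef(sim_a[iu], sim_b[iu])[0, 1]
```

With similarity matrices computed separately from 2D and 3D trials, `matrix_correlation` yields the kind of cross-condition correlation reported above; values near 1 indicate that the two modalities induce nearly identical similarity structure.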