Distinct representation of navigational action affordances in human behavior, brains and deep neural networks

Abstract

Humans can use a variety of actions to navigate their immediate environment. To decide how to move, the brain must determine which navigational actions are afforded by the current environment. Here, we demonstrate that human visual cortex represents navigational action affordances of complex natural scenes. Behavioral annotations of possible navigational actions (e.g., walking, cycling) show that humans group environments into distinct affordance clusters using at least three separate dimensions. Representational similarity analysis of multi-voxel fMRI responses in scene-selective visual cortex regions shows that perceived affordances are represented in a manner that is only partly explained by other scene properties (e.g., contained objects), and independent of the task performed in the scanner. Visual features extracted from deep neural networks (DNNs) pretrained on a range of other visual understanding tasks fail to fully account for behavioral and neural representations of affordances. While training DNNs directly on affordance labels improves their predictions of human-perceived affordances, the best human-model alignment is observed when visual DNN features are paired with rich linguistic representations in a multi-modal large language model. These results uncover a new type of representation in the human brain that reflects action affordances independent of other scene properties.
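To illustrate the comparison method mentioned in the abstract, the following is a minimal sketch of representational similarity analysis (RSA), assuming hypothetical placeholder inputs: `voxel_patterns` (scenes x voxels fMRI responses) and `affordance_ratings` (scenes x actions behavioral annotations). The study's actual preprocessing, regions of interest, and statistical procedures are not reproduced here.

```python
# Minimal RSA sketch: compare the geometry of neural responses with the
# geometry of behavioral affordance annotations across the same set of scenes.
# All data below are random placeholders, not the study's data.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_scenes = 50
voxel_patterns = rng.normal(size=(n_scenes, 200))     # hypothetical fMRI patterns
affordance_ratings = rng.normal(size=(n_scenes, 6))   # hypothetical affordance annotations

# Representational dissimilarity matrices (RDMs): pairwise distances between
# scenes, kept as condensed upper-triangle vectors.
neural_rdm = pdist(voxel_patterns, metric="correlation")
behavior_rdm = pdist(affordance_ratings, metric="correlation")

# RSA relates the two representational geometries with a rank correlation.
rho, p = spearmanr(neural_rdm, behavior_rdm)
print(f"Spearman rho = {rho:.3f}, p = {p:.3g}")
```

The same comparison can be repeated with RDMs built from DNN feature activations to ask how well a given model's representation accounts for the behavioral or neural affordance structure.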
