Reward Modeling of Goal-directed Gaze Control

Abstract

Goal-directed visual search in natural scenes is a complex behavior that requires the flexible integration of vision, memory, and contextual knowledge. Here we introduce a reward-based framework that unifies these processes by learning target-specific reward functions directly from human fixation data using inverse reinforcement learning. Trained on the large-scale COCO-Search18 dataset, our reward-map model recovered dynamic reward structures that explain inhibitory tagging, adaptive stopping in target-absent trials, and contextual guidance from non-target objects, phenomena that prior models focused on target guidance cannot easily explain. While state-of-the-art transformer-based computer vision models can mimic human-like scanpaths, their priority maps remain largely static across fixations and therefore fail to capture the temporal dynamics of search. By contrast, reward maps strike a balance between predictive performance and mechanistic interpretability, offering a framework for generating testable hypotheses about human search behavior.
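To make the core idea concrete, below is a minimal sketch of how a reward map might be recovered from fixation data with maximum-entropy inverse reinforcement learning. All names, the grid resolution, the synthetic fixation data, and the softmax fixation policy are illustrative assumptions, not the paper's actual model, which learns dynamic, fixation-dependent reward structures from COCO-Search18 scanpaths.

```python
import numpy as np

# Hypothetical illustration: maximum-entropy IRL over a coarse fixation grid.
# States are cells of an H x W grid over the image; scanpaths are sequences
# of fixated cell indices. The reward map r is adjusted until a softmax
# fixation policy reproduces the empirical fixation distribution.

H, W = 8, 10                      # grid resolution (assumed, not from the paper)
n_states = H * W
rng = np.random.default_rng(0)

# Toy stand-in for human fixation data: indices of fixated grid cells.
human_fixations = rng.integers(0, n_states, size=500)

# Empirical state-visitation frequencies from the fixation data.
mu_human = np.bincount(human_fixations, minlength=n_states).astype(float)
mu_human /= mu_human.sum()

r = np.zeros(n_states)            # reward parameters, one per grid cell
lr = 0.5

for step in range(200):
    # Softmax fixation policy induced by the current reward map.
    p = np.exp(r - r.max())
    p /= p.sum()
    # MaxEnt IRL gradient: empirical visitation minus model visitation.
    r += lr * (mu_human - p)

reward_map = r.reshape(H, W)      # learned reward map over image locations
print("top-rewarded cell:", np.unravel_index(reward_map.argmax(), (H, W)))
```

In this stateless simplification the gradient reduces to the mismatch between observed and predicted fixation frequencies; the model described in the abstract additionally lets the reward map evolve across fixations, which is what allows it to capture effects such as inhibitory tagging and adaptive stopping.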
