Reward Modeling of Goal-directed Gaze Control
Abstract
Goal-directed visual search in natural scenes is a complex behavior that requires the flexible integration of vision, memory, and contextual knowledge. Here we introduce a reward-based framework that unifies these processes by learning target-specific reward functions directly from human fixation data using inverse reinforcement learning. Trained on the large-scale COCO-Search18 dataset, our reward map model recovered dynamic reward structures that explain inhibitory tagging, adaptive stopping in target-absent trials, and contextual guidance from non-target objects, phenomena that are not easily explained by prior models focused on target guidance. While state-of-the-art transformer-based computer vision models can mimic human-like scanpaths, their priority maps remain largely static across fixations and therefore fail to capture the temporal dynamics of search. By contrast, reward maps strike a balance between predictive performance and mechanistic interpretability, offering a framework for generating testable hypotheses about human search behavior.
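To make the core idea concrete, the sketch below shows a drastically simplified, hypothetical version of reward learning from fixations: images are discretized into a grid of cells, fixations are modeled as softmax choices over a per-cell reward map, and the reward is fit by matching the model's visitation distribution to the empirical one (the gradient of the maximum-entropy log-likelihood). The grid size, the simulated fixation data, and the single-step choice model are all illustrative assumptions; the paper's actual model is dynamic, target-specific, and trained on COCO-Search18 scanpaths rather than simulated data.

```python
import numpy as np

# Hypothetical setup: images are discretized into an H x W grid of cells;
# each human fixation is recorded as the index of the cell it lands in.
H, W = 10, 16
N_CELLS = H * W

rng = np.random.default_rng(0)

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

# Toy "ground-truth" reward with a peak at one hypothetical target cell,
# used here only to generate stand-in "human" fixations. In the paper,
# fixations would come from COCO-Search18 scanpaths instead.
true_reward = np.zeros(N_CELLS)
true_reward[5 * W + 8] = 3.0
fixations = rng.choice(N_CELLS, size=2000, p=softmax(true_reward))

# Maximum-entropy-style fit: for a softmax choice model, the gradient of
# the log-likelihood w.r.t. per-cell rewards is (empirical visitation
# frequency - model visitation frequency).
reward = np.zeros(N_CELLS)
empirical = np.bincount(fixations, minlength=N_CELLS) / len(fixations)

lr = 5.0
for step in range(500):
    model = softmax(reward)             # predicted fixation distribution
    reward += lr * (empirical - model)  # match visitation frequencies

# Reshape the learned rewards into a map for inspection / visualization.
reward_map = reward.reshape(H, W)
print("peak learned reward at cell:", np.unravel_index(reward_map.argmax(), (H, W)))
```

In this toy reduction the learned reward peaks at the cell humans fixate most, recovering the generating structure; a full inverse-reinforcement-learning treatment would additionally condition on fixation history and scene context, which is what allows the learned rewards to change across fixations and capture dynamics such as inhibitory tagging.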