Goal Reduction with Loop-Removal Accelerates RL and Models Human Brain Activity in Goal-Directed Learning

Abstract

Goal-directed planning presents a challenge for classical RL algorithms because of the vast combinatorial state and goal spaces, whereas humans and animals adapt to complex environments with diverse, non-stationary objectives, often by employing intermediate goals for long-horizon tasks. Here, we propose a goal reduction mechanism for effectively deriving subgoals from arbitrary and distant original goals, using a novel loop-removal technique. The resulting method, called the goal-reducer, distills high-quality subgoals from a replay buffer without requiring prior global knowledge of the environment. Simulations show that the goal-reducer can be integrated into RL frameworks such as Deep Q-learning and Soft Actor-Critic. It accelerates learning in both discrete and continuous action-space tasks, such as grid-world navigation and robotic arm manipulation, relative to the corresponding standard RL models. Moreover, when combined with a local policy and without iterative training, the goal-reducer outperforms its integrated deep RL counterparts in solving a navigation task. This goal reduction mechanism also models human problem-solving. Comparing the model's performance and activations with human behavior and fMRI data in a treasure-hunting task, we found matching representational patterns between the goal-reducer agent's components and corresponding human brain areas, particularly the vmPFC and basal ganglia. These results suggest that humans may use a similar computational framework for goal-directed behavior.
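The abstract does not spell out the loop-removal procedure, but the core idea of pruning revisited states from replayed trajectories before sampling subgoals can be illustrated roughly. The sketch below is an assumption-laden reading, not the paper's implementation: the function names `remove_loops` and `sample_subgoal`, the use of hashable discrete states, and the uniform sampling of intermediate states are all hypothetical choices for illustration.

```python
import random


def remove_loops(trajectory):
    """Return a loop-free version of a state trajectory (sketch).

    Whenever a state reappears later in the trajectory, the segment
    between its two occurrences (a loop) is skipped, keeping only the
    shortcut. States are assumed to be hashable.
    """
    last_index = {}
    for i, s in enumerate(trajectory):
        last_index[s] = i  # remember the last visit to each state

    loop_free = []
    i = 0
    while i < len(trajectory):
        s = trajectory[i]
        loop_free.append(s)
        i = last_index[s] + 1  # jump past any loop that returns to s
    return loop_free


def sample_subgoal(replay_trajectories, start, goal):
    """Sample a candidate subgoal between `start` and `goal` (sketch).

    Candidates are intermediate states on loop-free versions of
    replayed trajectories that pass through both `start` and `goal`.
    """
    candidates = []
    for traj in replay_trajectories:
        clean = remove_loops(traj)
        if start in clean and goal in clean:
            i, j = clean.index(start), clean.index(goal)
            if i < j - 1:
                candidates.extend(clean[i + 1 : j])
    return random.choice(candidates) if candidates else None
```

Under these assumptions, a wandering trajectory such as `['A', 'B', 'C', 'B', 'D']` collapses to `['A', 'B', 'D']`, so any subgoal sampled between `'A'` and `'D'` comes from the shortened path rather than from the detour.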
