Estimating the motor exploration in reinforcement learning

Abstract

What motor exploration strategies do animals use to learn a skill from rewards? Reinforcement learning theory provides no guidance for estimating motor exploration, the behavioral component aimed at discovering better strategies. Inspired by the brain’s modular organization, we postulate a latent learner that explores by injecting an additive source of ideal randomness into behavior.

We assume that the learner is ignorant of the other motor components, an arrangement that is suboptimal by design. Evolutionary fitness then argues that these other components should display mainly non-ideal variability.
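As an illustration only, the decomposition described above can be sketched as a toy simulation. Everything specific here is an assumption and is not taken from the article: "ideal randomness" is modeled as white Gaussian noise, the other motor component as a temporally correlated AR(1) process, the reward as a negative squared distance to a hypothetical target, and the learner as a simple reward-baseline update that correlates reward only with its own exploration.

```python
import random


def simulate(n_trials=5000, target=1.0, lr=0.05,
             sigma_explore=0.2, drift_corr=0.95, sigma_drift=0.05, seed=0):
    """Toy decomposition: behavior = learner mean + exploration + other component.

    All parameter names and values are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    mean = 0.0       # latent learner's current motor command
    drift = 0.0      # other motor component, invisible to the learner
    baseline = 0.0   # running estimate of the average reward
    history = []
    for _ in range(n_trials):
        # Exploration: independent white Gaussian noise (assumed "ideal").
        explore = rng.gauss(0.0, sigma_explore)
        # Other component: AR(1) process (assumed "non-ideal" variability).
        drift = drift_corr * drift + rng.gauss(0.0, sigma_drift)
        behavior = mean + explore + drift
        reward = -(behavior - target) ** 2
        # The learner credits the reward change to its own exploration only.
        mean += lr * (reward - baseline) * explore
        baseline += 0.1 * (reward - baseline)
        history.append((behavior, explore, drift, reward))
    return mean, history


if __name__ == "__main__":
    final_mean, trials = simulate()
    print(f"final learner mean: {final_mean:.3f} over {len(trials)} trials")
```

In this sketch the learner still drifts toward the target even though it never observes the second component, which loosely mirrors the idea of a learner that is suboptimal by design. It does not reproduce the estimation procedure applied to the songbird data.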

We test this recipe for behavior decomposition in songbirds subjected to a vocal pitch conditioning task. The exploratory component we estimate from vocalizations accounts for the motor contribution of a basal ganglia pathway, and the other behavioral component accounts for the birds’ suboptimal learning trajectories. This congruence between normative exploration and brain organization suggests that the evolutionary pressure for behavioral optimality is weaker than the pressure to learn from purely random trials.
