The Effect of Reward Magnitude on Different Types of Exploration in Human Reinforcement Learning

Kanji Shimomura
Kenji Morita

Abstract

How humans resolve the explore–exploit dilemma in complex environments is an important open question. Previous studies suggested that environmental richness may affect the degree of exploration in a type-specific manner and reduce random exploration while increasing uncertainty-based exploration. Our study examined this possibility by extending a recently developed two-armed bandit task that can dissociate the uncertainty and novelty of stimuli. To extract the pure effect of environmental richness, we manipulated the reward by its magnitude, not its probability, across blocks because reward probability affects outcome controllability. Participants (N = 198) demonstrated increased optimal choices when the relative reward magnitude was higher. A behavioral analysis with computational modeling revealed that a higher reward magnitude reduced the degree of random exploration but had little effect on the degree of uncertainty- and novelty-based exploration. These results suggest that humans modulate their degree of random exploration depending on the relative level of environmental richness. Combined with findings from previous studies, our findings indicate the possibility that outcome controllability also influences the exploration–exploitation balance in human reinforcement learning.
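
For readers unfamiliar with how the three types of exploration are commonly separated in computational models, the following is a minimal illustrative sketch, not the authors' model. It assumes a softmax choice rule in which an inverse temperature `beta` governs random exploration, while hypothetical bonus weights `phi` and `eta` govern uncertainty-based and novelty-based exploration. The function name, parameter names, and all numeric values are assumptions made for illustration only.

```python
import math


def choice_probabilities(values, uncertainties, novelties,
                         beta=1.0, phi=0.0, eta=0.0):
    """Softmax choice probabilities for a bandit with exploration bonuses.

    beta: inverse temperature (lower beta means more random exploration).
    phi:  weight on an uncertainty bonus (hypothetical parameter name).
    eta:  weight on a novelty bonus (hypothetical parameter name).
    """
    utilities = [
        beta * (q + phi * u + eta * n)
        for q, u, n in zip(values, uncertainties, novelties)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(utilities)
    weights = [math.exp(x - peak) for x in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Two arms with illustrative learned values; arm 0 is the better option.
values = [1.0, 0.5]
no_bonus = [0.0, 0.0]

# Less random exploration (higher beta) concentrates choice on the better arm.
p_more_random = choice_probabilities(values, no_bonus, no_bonus, beta=1.0)
p_less_random = choice_probabilities(values, no_bonus, no_bonus, beta=5.0)

# With equal values, an uncertainty bonus favors the more uncertain arm.
p_uncertainty = choice_probabilities([0.5, 0.5], [0.1, 0.9], no_bonus,
                                     beta=1.0, phi=1.0)
```

In this sketch, raising `beta` from 1 to 5 moves the probability of choosing the better arm from about 0.62 to about 0.92, which is one way a reduction in random exploration could appear in choice data while the bonus weights stay unchanged.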

Version published to 10.1007/s42113-024-00224-6
Oct 3, 2024
Version published to 10.21203/rs.3.rs-4627464/v1 on Research Square
Jul 16, 2024
