Contribution of statistical learning to mitigating the curse of dimensionality in reinforcement learning

Abstract

Natural environments abound with patterns and regularities. Learning these regularities, especially through statistical learning, has been shown to strongly influence perception, memory, and other cognitive functions. Using a novel experimental paradigm involving two orthogonal tasks, we investigated whether regularities in the environment can enhance reward learning. In one task, human participants predicted the next stimulus in a sequence by recognizing regularities in a feature. In a separate, multidimensional learning task, they learned the predictive value of a different set of stimuli, based on reward feedback received after choosing between pairs of stimuli. Using both model-free and model-based approaches, we found that participants used regularities about features from the sequence-prediction task to bias their behavior in the learning task, so that the values associated with the regular feature had a greater influence on choice. Fitting of choice behavior revealed that the effects of the regularity manipulation were more consistent with attentional modulation of learning than of decision making. Specifically, the learning rates for the feature with regularity were higher, especially when learning from the forgone option during unrewarded trials. This demonstrates that feature regularities can intensify the confirmation bias observed in reward learning. Our results suggest that by enhancing learning about certain features, detecting regularities in the environment can reduce dimensionality and thus mitigate the curse of dimensionality in reward learning. Such interactions between statistical and reward learning have important implications for learning in naturalistic settings.
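
The kind of update described above, with a higher learning rate for the regular feature and learning about the forgone option on unrewarded trials, can be illustrated with a minimal sketch. This is not the fitted model from the article: the function name `update_values`, the feature names, the initial value of 0.5, the learning rates, and the choice to move the forgone option toward 1 after an unrewarded choice are all assumptions made for illustration.

```python
# Illustrative feature-based value update (hypothetical, not the article's model).
# A stimulus is a dict mapping a feature name to its level, e.g.
# {"shape": "square", "color": "red"}. Values are stored per (feature, level).

INITIAL_VALUE = 0.5  # assumed neutral starting value


def update_values(values, chosen, forgone, reward, regular_feature,
                  alpha_regular=0.3, alpha_other=0.1):
    """Return a new value table after one choice trial.

    alpha_regular and alpha_other are assumed learning rates for the feature
    with regularity and for the remaining features, respectively.
    """
    new_values = dict(values)

    def nudge(stimulus, target):
        # Move each feature value of the stimulus toward the target,
        # using the larger learning rate for the regular feature.
        for feature, level in stimulus.items():
            alpha = alpha_regular if feature == regular_feature else alpha_other
            old = new_values.get((feature, level), INITIAL_VALUE)
            new_values[(feature, level)] = old + alpha * (target - old)

    # Learn about the chosen option from the obtained reward (0 or 1).
    nudge(chosen, float(reward))

    # Assumption: after an unrewarded choice, the forgone option is treated
    # as if it would have been rewarded, so its values move toward 1.
    if reward == 0:
        nudge(forgone, 1.0)

    return new_values


if __name__ == "__main__":
    chosen = {"shape": "square", "color": "red"}
    forgone = {"shape": "circle", "color": "blue"}
    table = update_values({}, chosen, forgone, reward=0, regular_feature="shape")
    for key in sorted(table):
        print(key, round(table[key], 3))
```

In this sketch, a single unrewarded trial changes the values tied to the regular feature (here, shape) more than those tied to the other feature, for both the chosen and the forgone stimulus. Repeated over trials, this would let the regular feature dominate choice, which is one way to picture the dimensionality reduction suggested above.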

Significance statement

Natural environments are filled with patterns and regularities that we are adept at detecting. Detecting these regularities engages our attentional system, which in turn significantly influences other cognitive functions. But could these processes enhance reward learning in high-dimensional environments where feedback is limited? We aim to answer this question using a novel experimental paradigm combined with computational approaches. We found that detecting regularities in stimulus features can enhance learning by increasing learning rates, especially for the forgone option on unrewarded trials. These findings suggest that learning about regularities of the environment can refine feature-based learning to mitigate the curse of dimensionality.
