Revealing the mechanisms underlying latent learning with successor representations
Abstract
Latent learning experiments were critical in shaping Tolman’s cognitive map theory. In a spatial navigation task, latent learning means that animals acquire knowledge of their environment through exploration, without explicit reinforcements. Animals pre-exposed in this way perform better on a subsequent learning task than those that were not, and the size of this enhancement depends on the specific design of the pre-exposure phase. Here, we investigate latent learning and its modulation using computational modeling based on the deep successor representation (DSR), a recent model of cognitive map formation. Our results confirm experimental observations: exploration aligned with the future reward location significantly improves subsequent reward learning compared to random or no exploration. This effect generalizes across different action selection strategies. We show that these performance differences follow from the spatial information encoded in the structure of the DSR acquired during the pre-exposure phase. In summary, this study sheds light on the mechanisms underlying latent learning and on how such learning shapes cognitive maps, thereby affecting their effectiveness in goal-directed spatial tasks.
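To make the mechanism concrete, the sketch below implements a successor representation in its simplest tabular form (the study itself uses a deep SR with learned features). The SR matrix M is learned by temporal-difference updates during reward-free exploration, and once a reward vector R is supplied, state values follow as V = M R. The linear track, unbiased random-walk policy, and parameter values are illustrative assumptions, not the study's actual setup.

```python
import numpy as np

# Minimal tabular sketch of successor-representation (SR) learning on a
# 1-D track. Illustrative only: the paper's agent is a *deep* SR with
# learned features, and these states and parameters are assumptions.

n_states = 10        # states 0..9 on a linear track
gamma = 0.95         # discount factor
alpha = 0.1          # SR learning rate
M = np.eye(n_states) # SR matrix: M[s, s'] ~ expected discounted visits to s'

rng = np.random.default_rng(0)

# Reward-free pre-exposure ("latent learning"): update the SR from
# randomly sampled transitions (s, s') of an unbiased walk, with no
# reward signal, using the TD update
#   M[s] <- M[s] + alpha * (onehot(s) + gamma * M[s_next] - M[s])
for _ in range(5000):
    s = rng.integers(n_states)
    s_next = np.clip(s + rng.choice([-1, 1]), 0, n_states - 1)
    target = np.eye(n_states)[s] + gamma * M[s_next]
    M[s] += alpha * (target - M[s])

# When a reward is introduced later, values follow immediately from the
# pre-learned SR: V = M @ R. No further map learning is required.
R = np.zeros(n_states)
R[n_states - 1] = 1.0   # reward placed at the end of the track
V = M @ R
print(np.round(V, 2))   # values decay smoothly with distance to the reward
```

This separation of predictive map (M) and reward (R) is what lets a pre-exposed agent exploit a newly introduced reward faster than a naive one.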
Author summary
Latent learning enables animals to construct cognitive maps of their environment without direct reinforcement. This process facilitates efficient navigation when rewards are introduced later: animals familiar with a maze through prior exposure learn rewarded tasks faster than those without pre-exposure. Evidence suggests that the design of the pre-exposure phase strongly affects the effectiveness of latent learning: targeted pre-exposure focused on future reward locations enhances learning more than generic pre-exposure. However, the mechanisms driving these differences remain understudied. This study investigates how pre-exposure methods influence performance in a subsequent navigation task, using an artificial agent based on deep successor representations, a model for learning cognitive maps, within a reinforcement learning framework. Our findings reveal that, even before reward learning, agents receiving targeted pre-exposure develop spatial features more closely aligned with those of reward-trained agents than do agents receiving generic pre-exposure. This alignment enables the targeted-pre-exposure agent to take more effective goal-directed actions, resulting in accelerated initial learning. The persistence of this advantage even when the agent’s exploration policy is modified indicates a robust cognitive map within the successor representation.
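The claim that pre-exposure design matters can likewise be illustrated with a tabular SR: because the SR is policy-dependent, exploration biased toward the future reward location yields a predictive map that already favors the goal. The hedged sketch below compares generic (unbiased) and targeted (goal-biased) random walks; the track, bias parameter, and learning settings are hypothetical illustrations, not the paper's experimental design.

```python
import numpy as np

# Illustrative sketch of why pre-exposure design matters: the SR is
# policy-dependent, so a walk biased toward the future reward location
# concentrates predictive mass there. All settings here are assumptions.

def learn_sr(p_right, n_states=10, gamma=0.95, alpha=0.1, steps=5000, seed=0):
    """Learn a tabular SR by following a biased random walk on a track."""
    rng = np.random.default_rng(seed)
    M = np.eye(n_states)
    s = 0
    for _ in range(steps):
        step = 1 if rng.random() < p_right else -1
        s_next = int(np.clip(s + step, 0, n_states - 1))
        target = np.eye(n_states)[s] + gamma * M[s_next]
        M[s] += alpha * (target - M[s])
        s = s_next
    return M

R = np.zeros(10)
R[9] = 1.0  # reward later placed at the right end of the track

V_generic = learn_sr(p_right=0.5) @ R   # generic (unbiased) pre-exposure
V_targeted = learn_sr(p_right=0.7) @ R  # pre-exposure biased toward the goal

# The goal-biased SR assigns higher value along the path to the reward:
# the head start that the reward-learning phase inherits.
print(np.round(V_generic, 2))
print(np.round(V_targeted, 2))
```

In this toy setting, the targeted walk's SR yields higher initial value estimates along the route to the reward, mirroring the accelerated early learning reported for targeted pre-exposure.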