Predictive learning enables compositional representations

Abstract

The brain builds predictive models to plan future actions. These models generalize remarkably well to new environments, but it is unclear how neural circuits acquire this flexibility. Compositional representations, which have been observed in the brain, could explain this adaptability: they enable compositional generalization, in which independent modules performing different computations are selected and combined to solve novel composite tasks. In this work, we show that compositional representations emerge in recurrent neural networks (RNNs) trained solely to predict future sensory inputs. We trained an RNN to predict frames in a visual environment defined by independent latent factors and their corresponding dynamics. The network learned to solve this task by developing a compositional model: it formed disentangled representations of the latent factors, along with distinct, modular clusters of units, each implementing a single dynamic. Using competitive dynamics between clusters, the network autonomously selected which cluster to engage based on the sensory inputs alone, without task labels. This modular and disentangled architecture enabled compositional generalization: the network accurately predicted outcomes in novel contexts composed of unseen combinations of dynamics. Our findings explain how an unsupervised mechanism can learn the modular causal structure of an environment in a compositional code.
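To make the purely predictive objective concrete, here is a minimal sketch in a PyTorch-style setup of an RNN trained only on next-frame prediction. The environment, dimensions, and hyperparameters are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

# Minimal sketch of the training objective described above: an RNN that
# receives the current frame and is trained only to predict the next one.
# All sizes and the synthetic input batch are illustrative placeholders.

class PredictiveRNN(nn.Module):
    def __init__(self, frame_dim=64, hidden_dim=256):
        super().__init__()
        self.rnn = nn.RNN(frame_dim, hidden_dim, batch_first=True)
        self.readout = nn.Linear(hidden_dim, frame_dim)

    def forward(self, frames):
        # frames: (batch, time, frame_dim); predict frame t+1 from frames up to t
        hidden, _ = self.rnn(frames)
        return self.readout(hidden)

model = PredictiveRNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Hypothetical batch of frame sequences from an environment whose frames
# are generated by independent latent factors, each with its own dynamics.
frames = torch.randn(32, 50, 64)

prediction = model(frames[:, :-1])         # predictions for frames 1..T
loss = loss_fn(prediction, frames[:, 1:])  # purely predictive objective
loss.backward()
optimizer.step()
```

Note that no task or context labels appear anywhere in this objective; any modular structure must emerge from the prediction error alone.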

Significance

The brain can function in environments it has never encountered before, an ability called generalization. In this study, we show that when a neural network model is trained to predict future observations, it develops a modular structure supporting one form of generalization, compositional generalization, in which known dynamics can be recombined in new ways. Specifically, the network develops clusters of units that each implement the computation for a single dynamic, and it can recombine these clusters according to the input it receives, without any additional cues. We also show that the network forms distinct clusters of neurons coding for each latent factor of the environment, consistent with experimental observations.
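As a toy illustration of this evaluation, the sketch below (with hypothetical factor and dynamics names) enumerates contexts as combinations of per-factor dynamics and holds some combinations entirely out of training; accurate prediction on those held-out contexts is what compositional generalization measures.

```python
from itertools import product

# Hypothetical setup: each latent factor evolves under one of several
# dynamics. A context is one combination of dynamics across factors.
factor_dynamics = {
    "position": ["static", "drift", "bounce"],
    "color": ["static", "cycle"],
}

# All (position_dynamic, color_dynamic) combinations.
all_contexts = list(product(*factor_dynamics.values()))

# Hold out some combinations entirely: the network never sees them during
# training, so succeeding there requires recomposing dynamics it learned
# in other contexts rather than memorizing whole contexts.
held_out = {("bounce", "cycle"), ("drift", "static")}
train_contexts = [c for c in all_contexts if c not in held_out]
test_contexts = [c for c in all_contexts if c in held_out]

print("train:", train_contexts)
print("test (compositional generalization):", test_contexts)
```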
