Understanding cortical computation through the lens of joint-embedding predictive architectures


Abstract

Tracking prey or recognizing a lurking predator is as crucial for survival as anticipating their actions. To guide behavior, the brain must extract information about object identities and their dynamics from entangled sensory inputs. How it accomplishes this feat remains an open question. Predictive coding theories propose that this ability arises by comparing predicted sensory signals with actual inputs and reducing the associated prediction errors. However, existing models rely on generative architectures that compute prediction errors directly in the input space, which is hard to reconcile with neural anatomy. Here, we develop a theory that avoids this issue by operating solely on internal representations. Specifically, we introduce recurrent predictive learning (RPL), a recurrent joint-embedding predictive architecture inspired by self-supervised machine learning that learns abstract representations of object identities and their dynamics and predicts future object motion. Crucially, the model learns sequence representations that resemble successor-like representations observed in the primary visual cortex of humans. The model also develops abstract sequence representations comparable to those reported in the macaque prefrontal cortex. Finally, we outline how RPL’s modular feedforward-recurrent organization could map onto canonical cortical microcircuits as a plausible model of cortical representation learning. Our work establishes a circuit-centric theoretical framework that unifies predictive processing with self-supervised learning, providing a fresh perspective on how the brain acquires an internal model of the world.
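The core architectural idea stated in the abstract, computing prediction errors on internal representations rather than in the input space, can be illustrated with a minimal toy sketch. This is not the authors' implementation: the encoder, predictor, shapes, and names below are all hypothetical, using tiny linear-tanh maps only to show where the joint-embedding predictive loss lives.

```python
import numpy as np

# Hypothetical toy sketch of a joint-embedding predictive loss.
# A predictor maps the current latent state to the encoder's representation
# of the NEXT input; the error is computed purely in latent space,
# never on the raw sensory input itself.

rng = np.random.default_rng(0)
d_in, d_lat = 16, 8  # arbitrary toy dimensions

W_enc = rng.normal(scale=0.1, size=(d_lat, d_in))   # toy encoder weights
W_rec = rng.normal(scale=0.1, size=(d_lat, d_lat))  # toy recurrent predictor weights

def encode(x):
    """Map a sensory input to an internal representation."""
    return np.tanh(W_enc @ x)

def predict_next(z):
    """Predict the next internal representation from the current one."""
    return np.tanh(W_rec @ z)

x_t, x_next = rng.normal(size=d_in), rng.normal(size=d_in)  # two toy "frames"

z_t = encode(x_t)
z_pred = predict_next(z_t)
z_target = encode(x_next)  # in a full model the target branch is gradient-stopped

# Prediction error in representation space (no pixel-space reconstruction).
loss = np.mean((z_pred - z_target) ** 2)
```

In a trained model both branches would share (or slowly track) the same encoder, and the predictor would be recurrent over whole sequences; the sketch only fixes the placement of the error signal, which is the point of contrast with generative predictive-coding models.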
