Composing predictable primitives for zero-shot learning
Abstract
Animals exhibit a remarkable capacity to adapt to novel challenges on their first attempt. We propose that this zero-shot adaptability requires keeping behavioral outcomes highly predictable and controllable, which is achieved by structuring behavior as short compositions of simple primitives. Continuously across environments, the agent learns to predict the outcomes of primitive sequences in an unsupervised manner. When solving a new task, it optimizes short primitive compositions using differentiable model-predictive control. Constraining behavior to short compositions keeps gradient-based planning tractable, yielding efficiency orders of magnitude beyond traditional methods. Applied to mouse homing-escape behavior, our model explains the emergence of subgoals across arena configurations where conventional models fail. Ablations confirm that the components enabling rapid learning in our model (world models, planning models, primitives, and compositionality) are necessary to reproduce escape trajectories. Our results suggest that optimizing predictable primitive compositions is a core mechanism driving zero-shot adaptability and explaining mouse behavioral phenomena.
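To make the planning idea concrete, the following is a minimal sketch of gradient-based optimization over a short composition of primitives. It is not the authors' implementation. It assumes a toy setting that the abstract does not specify: each primitive is a straight move of fixed length `STEP`, the "world model" `predict` is a hand-written analytic function rather than a learned one, and the gradient is derived by hand rather than by automatic differentiation. All names (`STEP`, `predict`, `plan`) are hypothetical.

```python
import math

STEP = 1.0  # assumed fixed length of each movement primitive (hypothetical)


def predict(start, angles):
    """Toy stand-in for a learned world model.

    Returns the predicted end point after executing one straight move
    of length STEP per heading angle in `angles`.
    """
    x, y = start
    for a in angles:
        x += STEP * math.cos(a)
        y += STEP * math.sin(a)
    return x, y


def plan(start, goal, n_primitives=2, lr=0.1, iters=500):
    """Optimize the headings of a short primitive sequence by gradient descent.

    Cost is the squared distance between the predicted end point and the goal.
    """
    # Arbitrary, slightly asymmetric initial headings (an assumption).
    angles = [0.1 * (k + 1) for k in range(n_primitives)]
    for _ in range(iters):
        x, y = predict(start, angles)
        ex, ey = x - goal[0], y - goal[1]
        # Analytic gradient of ex**2 + ey**2 with respect to each heading.
        grads = [2 * STEP * (-ex * math.sin(a) + ey * math.cos(a)) for a in angles]
        angles = [a - lr * g for a, g in zip(angles, grads)]
    return angles


if __name__ == "__main__":
    headings = plan((0.0, 0.0), (1.0, 1.0))
    print(headings, predict((0.0, 0.0), headings))
```

Because only two headings are optimized, the search space stays small, and the planner converges in a few hundred cheap updates. This illustrates in miniature why restricting behavior to short compositions can keep gradient-based planning tractable.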