The blessing and curse of Value-Shaping imitation
Abstract
Empirical studies suggest that, in the context of reinforcement learning, imitation is instantiated as a form of Value-Shaping (VS) process, in which the actions of another learner (the demonstrator) affect the learner's value function. These studies also show that, consistent with evolutionary theory, imitation is controlled through a meta-learning process in which inaccurate demonstrators are progressively imitated less. In the present study we aimed to evaluate the normative status of Value-Shaping in comparison to other forms of imitation and vicarious learning. To do so, we simulated increasingly complex scenarios and showed that, while Value-Shaping imitation performs well in simple settings (and outperforms other forms of imitation), it is maladaptive in more complex ones. More specifically, when imitation is bidirectional and the environment unstable, VS leads to the worst possible performance. We speculate that it may contribute to echo chambers and opinion polarization in real life. Finally, we introduce a new model, conditional Value-Shaping (cVS), which overcomes these difficulties.
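To make the core idea concrete, the following is a minimal sketch of how Value-Shaping imitation could be instantiated in tabular Q-learning. The specific bonus form (a fixed reward bonus `omega` when the learner's action matches the demonstrator's) and all parameter values are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

def vs_update(Q, s, a, r, s_next, demo_action,
              alpha=0.1, gamma=0.9, omega=0.5):
    """One Q-learning step with an illustrative value-shaping bonus.

    Value-Shaping: the demonstrator's behavior enters through the
    learner's VALUE FUNCTION, here as a reward bonus `omega` whenever
    the learner's action matches the demonstrated action. (The exact
    functional form is an assumption for illustration.)
    """
    shaped_r = r + omega * float(a == demo_action)
    td_target = shaped_r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Toy usage: one state, two actions, zero extrinsic reward,
# demonstrator always chooses action 1.
Q = np.zeros((1, 2))
for t in range(50):
    vs_update(Q, s=0, a=t % 2, r=0.0, s_next=0, demo_action=1)
# The demonstrated action ends up valued higher than the alternative.
```

The meta-learning control described above (imitating inaccurate demonstrators less) could be layered on top by decreasing `omega` when the demonstrator's choices predict poor outcomes; that adjustment rule is not shown here.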