Shared sensitivity to data distribution during learning in humans and transformer networks

Abstract

Do humans learn like transformers? We trained both humans (n = 530) and transformer networks on a rule-learning task in which they had to respond to a query in a sequence. At test, we measured ‘in-context’ learning (generalizing the rule to novel queries) and ‘in-weights’ learning (recalling past experiences from memory). Manipulating the diversity and redundancy of examples in the training distribution, we found that humans and transformer networks respond in very similar ways. In both types of learner, redundancy and diversity trade off in driving in-weights and in-context learning, respectively, whereas a composite distribution with a balanced mix of redundancy and diversity allows the two strategies to be used in tandem. However, we also found that while humans benefit from dynamic training schedules that emphasize diverse examples early, transformers do not. So, while the same data-distributional properties promote learning in humans and transformer networks, only people benefit from curricula.
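The diversity/redundancy manipulation can be made concrete with a toy version of such a task. The sketch below is purely illustrative and not the authors’ actual paradigm: it assumes a simple paired-associate design in which each trial presents (item, label) pairs followed by a query, and a hypothetical parameter n_items controls whether items recur often across trials (redundancy) or rarely (diversity).

    import random

    def make_sequence(item_to_label, context_len=4):
        # One trial: a context of (item, label) pairs followed by a query
        # item drawn from the context; the target is the query's label.
        items = random.sample(list(item_to_label), context_len)
        context = [(item, item_to_label[item]) for item in items]
        query = random.choice(items)
        return context, query, item_to_label[query]

    def make_training_set(n_items, n_trials, n_labels=8, seed=0):
        # n_items sets the diversity/redundancy trade-off: with few items,
        # each one recurs across many trials (redundant, which supports
        # in-weights recall); with many items, repeats are rare (diverse,
        # which pushes the learner toward in-context use of the pairs
        # shown in the current context).
        random.seed(seed)
        item_to_label = {f"item{i}": i % n_labels for i in range(n_items)}
        trials = [make_sequence(item_to_label) for _ in range(n_trials)]
        return item_to_label, trials

    # Test-time probes separate the two strategies:
    # - in-weights probe: a frequently trained item with an uninformative
    #   context, so the label must come from memory of past trials;
    # - in-context probe: a novel item whose label is recoverable only
    #   from the (item, label) pairs supplied in the current context.
    redundant = make_training_set(n_items=16, n_trials=1000)    # high redundancy
    diverse = make_training_set(n_items=1600, n_trials=1000)    # high diversity

A composite distribution of the kind described in the abstract would mix these two regimes, so that both probe types remain solvable.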
