The Rational Irrational: Better Learners Show Stronger Frequency Heuristics

Mianzhi Hu
Darrell A. Worthy

Abstract

Does favoring less valuable options that deliver more frequent rewards reflect flawed decision-making or an adaptive strategy in complex environments? Frequency effects, defined as a bias toward more frequently rewarded but less valuable options, have traditionally been viewed as maladaptive decision-making deficits. In the present study, we used a within-subject design in which participants completed a four-option reinforcement learning task twice, once under a baseline condition and once with a reward frequency manipulation, to test whether better baseline learning predicts greater or lesser susceptibility to frequency-based biases. Participants were first trained on two fixed option pairs and then transferred their knowledge to novel pairings in a testing phase. Across conditions, higher training accuracy generally predicted higher test accuracy, with one critical exception: on trials where a more valuable option was pitted against a more frequently rewarded but less valuable alternative, participants with higher training accuracy exhibited a stronger bias toward the more frequent option. Moreover, baseline optimal choice rates in these specific trials were unrelated to, and even slightly negatively correlated with, optimal choice rates under the frequency condition. Computational modeling further showed that participants with better baseline learning performance were better fit by frequency-sensitive models in the frequency condition and weighted frequency-based processing more heavily than value-based processing. Overall, these findings suggest that frequency effects, rather than signaling flawed learning, manifest more strongly in individuals with better baseline learning performance. This seemingly irrational bias may, under conditions of uncertainty, reflect a flexible, adaptive strategy that emerges among the best learners when value-based approaches are costly or unreliable.

Author Summary

In daily life, people often face choices between familiar, frequently encountered options and unfamiliar alternatives that may be more valuable. For example, we may keep visiting a local restaurant we know well instead of trying a new one with better reviews. This tendency, known as the frequency effect, reflects a bias toward options that yield more frequent rewards, even when those rewards are smaller and suboptimal overall. Traditionally, such behavior has been interpreted as a sign of neuropsychological impairment or flawed learning. Our study found the opposite. We asked 495 participants to complete a reinforcement learning task under two conditions: one with balanced reward frequencies and another in which one option was rewarded more frequently despite being less valuable than its alternative. Surprisingly, better learners in the balanced condition were more likely to show frequency effects when reward frequencies were manipulated to be uneven. Computational modeling confirmed that these individuals shifted from value-based strategies to frequency-based ones when the environment made value-based decisions more difficult. These findings suggest that frequency effects are not simply errors. Instead, they may represent an adaptive shortcut that emerges more strongly in better decision-makers as a flexible strategy for navigating uncertain environments when value-based calculations are costly or unreliable.
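
The contrast between value-based and frequency-based processing can be illustrated with a toy sketch. The summaries above do not specify the models that were fitted, so the two update rules below (a delta rule that tracks average reward, and a decay rule that accumulates reward over time) are stand-ins assumed for illustration only. The function names, parameter values, and the trial sequence are all hypothetical.

```python
from typing import Dict, List, Tuple

# Each trial is (option chosen, reward received). Hypothetical toy data:
# "frequent" pays a small reward often, "valuable" pays more but rarely.
Trial = Tuple[str, float]
TRIALS: List[Trial] = [
    ("frequent", 0.5), ("frequent", 0.5), ("valuable", 0.8),
    ("frequent", 0.5), ("frequent", 0.5), ("frequent", 0.5),
    ("valuable", 0.8), ("frequent", 0.5), ("frequent", 0.5),
    ("frequent", 0.5), ("valuable", 0.8),
]


def delta_rule(trials: List[Trial], alpha: float = 0.3) -> Dict[str, float]:
    """Value-based stand-in: move the chosen option toward its reward."""
    values: Dict[str, float] = {}
    for option, reward in trials:
        current = values.get(option, 0.0)
        # Prediction-error update; converges toward the mean reward.
        values[option] = current + alpha * (reward - current)
    return values


def decay_rule(trials: List[Trial], decay: float = 0.9) -> Dict[str, float]:
    """Frequency-sensitive stand-in: accumulate reward, decay everything."""
    values: Dict[str, float] = {}
    for option, reward in trials:
        # All stored values fade a little on every trial.
        for name in values:
            values[name] *= decay
        # The chosen option adds its reward on top, so options that are
        # rewarded more often build up a larger running total.
        values[option] = values.get(option, 0.0) + reward
    return values


if __name__ == "__main__":
    print("delta rule:", delta_rule(TRIALS))
    print("decay rule:", decay_rule(TRIALS))
```

With these toy numbers, the delta rule ranks the valuable option higher, while the decay rule ranks the frequently rewarded option higher, which is the kind of divergence a reward frequency manipulation is designed to expose.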

Version published to 10.1101/2025.09.18.676999 on bioRxiv
Sep 18, 2025

Related articles

Automatic value learning results in counterproductive human behavior

This article has 5 authors:
1. Ido Ben-Artzi
2. Maayan Pereg
3. Roy Luria
4. Rani Moran
5. Nitzan Shahar
This article has no evaluations
Latest version Sep 7, 2025
Demand Avoidance in Value-Based Choice Under Risk: A Behavioral and Pupillometric Examination

This article has 2 authors:
1. Kevin da Silva Castanheira
2. A. Ross Otto
This article has no evaluations
Latest version Oct 7, 2025
