Humans use a dual information-seeking policy to improve noisy inferences outside the explore-exploit tradeoff

Abstract

Everyday decisions aim not only to earn rewards but also to learn about the world. In a large study (420 participants, >380,000 decisions), we asked how people gather information stripped of rewarding value, and compared their strategy with reward seeking in otherwise matched conditions. We found that humans use a two-stage information-seeking strategy in which they begin by repeatedly sampling each novel option in turn (which we call “streaking”) before engaging in uncertainty-guided exploration. Computational modeling revealed that this streak-first, explore-second strategy aims to form a first hypothesis about each novel option before minimizing global uncertainty, and that it improves the accuracy of the imprecise inferences humans make about sampled options. Artificial neural networks trained to optimize inference accuracy acquired uncertainty-guided exploration but not early streaking, which highlights streaking as a specific feature of human information seeking. Although streaking and uncertainty-guided exploration tended to be co-expressed in the same participants, the two stages were related to different psychological traits. Together, these results offer a novel account of human information seeking, its motives, and its benefits under imprecise cognitive computations, outside the explore-exploit tradeoff.
