Flexible behavior or flexible methods? A cross-taxon review of experimental designs in reversal learning
Abstract
Behavioral flexibility—the ability to adapt behavior in response to changing conditions—is widely recognized as a key feature of animal cognition. It is often measured using reversal learning tasks, where individuals must inhibit a previously rewarded response and adopt a new one after contingencies shift. Despite its widespread use, the comparability of these tasks across species remains unclear. We conducted a systematic review of 206 empirical studies (2014–2023) spanning eight major taxonomic groups: invertebrates, fishes, amphibians and reptiles, birds, rodents, other mammals, non-human primates, and humans. For each study, we extracted variables related to taxon coverage, sampling, learning and reversal criteria, cue types, and outcome measures. Analyses included nonparametric tests to assess group-level differences, linear discriminant analyses to explore multivariate structure, and model-based robustness checks. Our findings reveal three fundamental obstacles to reliable cross-species inference. First, research effort is highly imbalanced: birds, rodents, and humans accounted for over half of all study cells, while most animal diversity—especially invertebrates and amphibians and reptiles—remains virtually untested, with less than 1% of described species included per taxon. Second, research is taxonomically siloed: 99% of studies focus on a single group, limiting opportunities for direct comparison. Third, and most critically, methodological standards diverge dramatically across taxa. Humans were consistently held to the strictest learning criteria (median threshold 90%), while birds, invertebrates, and fishes most often used lower thresholds (80–84%). Overtraining was implemented in two-thirds of amphibian and reptile studies but was rare (less than 30%) elsewhere. The number of reversal phases differed more than threefold among groups. 
Nearly all studies of amphibians, reptiles, fishes, and invertebrates used single-reversal designs, whereas multi-reversal protocols were much more common in humans and non-human primates. Sample sizes (both per cell and per study), evaluation window lengths, cue types, and outcome metrics also displayed taxon-specific patterns. These systematic differences in experimental design introduce structural asymmetries that complicate cross-taxon comparisons, blurring the line between true cognitive variation and methodological artifacts. Although research to date has advanced our understanding, further progress will depend on greater methodological coordination and broader taxonomic coverage. Emerging large-scale collaborations are beginning to address these gaps, offering a promising path toward a more robust and equitable science of behavioral flexibility.