Bayesian reinforcement learning models reveal how great-tailed grackles improve their behavioral flexibility in serial reversal learning experiments

Abstract

Environments can change suddenly and unpredictably, and animals might benefit from being able to flexibly adapt their behavior through learning new associations. Serial (repeated) reversal learning experiments have long been used to investigate differences in behavioral flexibility among individuals and species. In these experiments, individuals initially learn that a reward is associated with a specific cue before the reward is reversed back and forth between cues, forcing individuals to reverse their learned associations. Cues are reliably associated with a reward, but the association between the reward and the cue frequently changes. Here, we apply and expand newly developed Bayesian reinforcement learning models to gain additional insights into how individuals might dynamically modulate their behavioral flexibility if they experience serial reversals. We derive mathematical predictions that, during serial reversal learning experiments, individuals will gain the most rewards if they 1) increase their *rate of updating associations* between cues and the reward to quickly change to a new option after a reversal, and 2) decrease their *sensitivity* to their learned association to explore the alternative option after a reversal. We reanalyzed reversal learning data from 19 wild-caught great-tailed grackles (Quiscalus mexicanus), eight of whom participated in the serial reversal learning experiment, and found that these predictions were supported. Their estimated association-updating rate was more than twice as high at the end of the serial reversal learning experiment as at the beginning, and their estimated sensitivities to their learned associations declined by about a third. The changes in behavioral flexibility that grackles showed in their experience of the serial reversals also influenced their behavior in a subsequent experiment, where individuals with more extreme rates or sensitivities solved more options on a multi-option puzzle box.
Our findings offer new insights into how individuals react to uncertainty and changes in their environment, in particular, showing how they can modulate their behavioral flexibility in response to their past experiences.

Behavioral flexibility, i.e. the “ability to adapt behavior to new circumstances through packaging information and making it available to other cognitive processes” (Logan et al. 2023), appears to be one of the crucial elements in how animal species respond to changing environments. Behavioral flexibility can change within the life of an individual, depending on its experience of how variable and predictable its surrounding environment is. But little is known about the cognitive processes involved in these temporal changes in behavioral flexibility within individuals.

This is what Lukas et al. (2024) investigated very thoroughly, using the framework of serial reversal learning experiments on great-tailed grackles to study different aspects of the question. Behavioral flexibility, as involved in serial reversal learning experiments, was previously modeled as consisting of two primary parameters: the rate of updating associations, phi (i.e. how fast individuals learn the association between a cue and its associated reward or danger); and the sensitivity to the learned associations, lambda (i.e. how strongly individuals base their choices on the associations they have learned).
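To make the roles of phi and lambda concrete, here is a minimal Python sketch of one common way such a two-parameter reinforcement learning model is written: a linear update of the association toward the observed reward, and a softmax choice rule. The exact equations used by Lukas et al. (2024) are not given in this text, so the functional forms, the function names `update_association` and `choice_probabilities`, and all numerical values below are assumptions for illustration only.

```python
import math


def update_association(association, reward, phi):
    """Move the stored association toward the observed reward at rate phi.

    Assumed form: a weighted average of the old association and the reward.
    phi = 0 means no updating; phi = 1 means only the latest outcome counts.
    """
    return (1.0 - phi) * association + phi * reward


def choice_probabilities(associations, lam):
    """Softmax over associations (assumed choice rule).

    lam = 0 gives random choice; larger lam makes the individual rely
    more strongly on small differences between learned associations.
    """
    scaled = [lam * a for a in associations]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical starting values, chosen only to illustrate the two steps.
associations = [0.1, 0.1]
associations[0] = update_association(associations[0], reward=1.0, phi=0.5)
print(associations)
print(choice_probabilities(associations, lam=3.0))
```

Under these assumptions, a larger phi shifts the association faster after a single rewarded or unrewarded choice, while a larger lambda pushes the choice probabilities further from 50:50 for the same difference in associations.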

Lukas et al. (2024)* used a Bayesian reinforcement learning model to infer phi and lambda in individuals going through serial reversal learning experiments. Their aim was to understand which of these two parameters explains most of the variation in grackle performance in serial reversal learning, how correlated the parameters are, how they can change over time depending on an individual’s experience, how variable they can be among individuals, and whether they can predict performance in other contexts. Beforehand, however, the authors used an individual-based model to check that the Bayesian reinforcement learning model could correctly estimate phi and lambda under their experimental design. They also used the Bayesian model to infer the range of values of phi and lambda an individual needs to exhibit to reduce errors in the serial reversal learning experiment.
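The idea of an individual-based simulation can be sketched as follows. This is a simplified illustration and not the authors' simulation: it assumes a linear association update, a two-option softmax (logistic) choice rule, equal starting associations of 0.1, and a reversal after a fixed number of trials rather than after the individual reaches a passing criterion. The function name `simulate_serial_reversal` and all parameter values are hypothetical.

```python
import math
import random


def simulate_serial_reversal(phi, lam, n_trials=300, reversal_every=50, seed=0):
    """Return the number of rewarded choices made by one simulated individual."""
    rng = random.Random(seed)
    associations = [0.1, 0.1]  # assumed equal, low starting associations
    rewarded_option = 0
    rewards = 0
    for trial in range(n_trials):
        # Assumed schedule: the reward switches option every `reversal_every` trials.
        if trial > 0 and trial % reversal_every == 0:
            rewarded_option = 1 - rewarded_option
        # With two options, the softmax reduces to a logistic function
        # of the difference between the two associations.
        difference = associations[0] - associations[1]
        p_first = 1.0 / (1.0 + math.exp(-lam * difference))
        choice = 0 if rng.random() < p_first else 1
        reward = 1.0 if choice == rewarded_option else 0.0
        # Only the association of the chosen option is updated.
        associations[choice] = (1.0 - phi) * associations[choice] + phi * reward
        rewards += int(reward)
    return rewards


# Compare a few hypothetical parameter combinations across simulated individuals.
for phi, lam in [(0.03, 4.0), (0.06, 4.0), (0.06, 3.0)]:
    totals = [simulate_serial_reversal(phi, lam, seed=s) for s in range(200)]
    print(phi, lam, sum(totals) / len(totals))
```

Running many simulated individuals with known phi and lambda, and then fitting the statistical model to their choices, is the kind of check that shows whether the two parameters can be recovered from a given experimental design. The fitting step is not shown here.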

Among other results, this study shows that in a context of rapidly changing but strongly reliable cues, the variation in the success of grackles is more strongly associated with the rate of updating associations (phi) than with the sensitivity to learned associations (lambda). In addition, phi increased within individuals over the course of the serial reversal learning experiment, while lambda only slightly decreased. Interestingly, different individuals could adopt different approaches during training and still reach the same final performance: slightly different combinations of changes in lambda and phi lead to different behaviors, but these compensate for each other in the final success rate.
This study provides exciting insights into the cognitive processes by which the behavioral flexibility of individuals can change in this type of serial reversal learning experiment. It also opens interesting avenues for understanding the mechanisms by which behavioral flexibility can change in the wild, helping individuals to cope with rapidly changing environments.

* Lukas et al. (2024) presents a post-study of the preregistered study Logan et al. (2019), which was peer-reviewed and received an In Principle Recommendation from PCI Ecology (Coulon 2019; the initial preregistration was split into 3 post-studies). A preregistered study is a study in which the context, aims, hypotheses and methodologies have been written down as an empirical paper, peer-reviewed and pre-accepted before the research is undertaken. Preregistrations are intended to reduce publication bias and reporting bias.

References

Coulon A (2019) Can context changes improve behavioral flexibility? Towards a better understanding of species adaptability to environmental changes. Peer Community in Ecology, 100019. https://doi.org/10.24072/pci.ecology.100019

Logan CJ, Lukas D, Bergeron L, Folsom M, McCune K (2019) Is behavioral flexibility related to foraging and social behavior in a rapidly expanding species? In Principle Acceptance by PCI Ecology of the Version on 6 Aug 2019. http://corinalogan.com/Preregistrations/g_flexmanip.html

Lukas D, McCune KB, Blaisdell AP, Johnson-Ulrich Z, MacPherson M, Seitz BM, Sevchik A, Logan CJ (2024) Bayesian reinforcement learning models reveal how great-tailed grackles improve their behavioral flexibility in serial reversal learning experiments. ecoevoRxiv, ver.4 peer-reviewed and recommended by Peer Community in Ecology. https://doi.org/10.32942/osf.io/4ycps
