Should I stay or should I go? Generalized marginal value theorem with temporal discounting
Abstract
Consider a person in an environment containing patchy rewards. Under what circumstances should they stay in a given reward patch, and when should they leave that patch to seek out a new one? In a landmark 1976 study, Charnov derived the action policy that maximizes the expected rate at which this person gathers rewards. This result is called the Marginal Value Theorem (MVT), and decades of study have shown that the behaviors of humans and other animals qualitatively follow MVT, but with notable and systematic deviations. These deviations have been hypothesized to arise because MVT does not incorporate temporal discounting, whereas humans and other animals tend to value current rewards more than future ones. Rigorously testing that hypothesis has been challenging because there is no mathematical theory that determines optimal patch foraging decision policies for agents who use temporal discounting. To fill this knowledge gap, I derived the optimal patch foraging policy for agents who exponentially discount future rewards and studied how that optimal policy depends upon their temporal discount rate and upon the structure of their environment. Notably, there are conditions under which the optimal policy with temporal discounting is to leave earlier than is predicted by MVT (under-staying), while under other conditions the optimal policy is to stay longer than is predicted by MVT (over-staying). The theory presented here delineates when each situation arises and may help to interpret the otherwise-puzzling ways in which human and animal behaviors deviate from MVT.
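The comparison between rate maximization and discounted-value maximization can be illustrated with a minimal numerical sketch. This is not the derivation used in this article. It assumes a hypothetical environment of identical patches whose reward rate decays exponentially with time in the patch, a fixed travel time between patches, and a forager who uses the same residence time in every patch. All parameter values and function names below are illustrative assumptions. The sketch finds the residence time that maximizes the long-run reward rate (the MVT policy) and the residence time that maximizes the exponentially discounted sum of future rewards, both by direct numerical search.

```python
import math

# Hypothetical environment parameters (assumptions for illustration only).
R0 = 1.0      # reward rate on arriving at a fresh patch
DECAY = 1.0   # time scale over which a patch depletes
TRAVEL = 1.0  # travel time between patches


def reward_rate(t):
    """Instantaneous reward rate after t time units in a patch."""
    return R0 * math.exp(-t / DECAY)


def long_run_rate(stay):
    """Undiscounted reward per unit time for a fixed residence time."""
    gain = R0 * DECAY * (1.0 - math.exp(-stay / DECAY))
    return gain / (stay + TRAVEL)


def discounted_value(stay, gamma):
    """Discounted value of repeating travel-then-stay cycles forever,
    measured at the start of a travel period (geometric series over cycles)."""
    k = 1.0 / DECAY + gamma
    per_patch = R0 * (1.0 - math.exp(-k * stay)) / k   # discounted gain in one patch
    cycle = -math.expm1(-gamma * (stay + TRAVEL))      # 1 - exp(-gamma * cycle length)
    return math.exp(-gamma * TRAVEL) * per_patch / cycle


def argmax(f, lo=1e-6, hi=50.0, tol=1e-9):
    """Golden-section search; assumes f has a single peak on [lo, hi]."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c, d = b - phi * (b - a), a + phi * (b - a)
        if f(c) < f(d):
            a = c   # the peak lies to the right of c
        else:
            b = d   # the peak lies to the left of d
    return (a + b) / 2.0


# MVT policy: leave when the instantaneous rate falls to the long-run average rate.
t_mvt = argmax(long_run_rate)
print(f"MVT residence time: {t_mvt:.4f}")
print(f"  rate at leaving:  {reward_rate(t_mvt):.4f}")
print(f"  long-run rate:    {long_run_rate(t_mvt):.4f}")

# Discounted policies for a few assumed discount rates.
for gamma in (0.01, 0.1, 0.5):
    t_disc = argmax(lambda s, g=gamma: discounted_value(s, g))
    print(f"gamma = {gamma}: discounted-optimal residence time {t_disc:.4f}")
```

In this sketch the discounted optimum approaches the MVT residence time as the discount rate goes to zero, which is a useful sanity check. The single-patch-type setup is deliberately simple and is not meant to reproduce the conditions for over-staying and under-staying analyzed in the article, which depend on the structure of the environment.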