g-distance: On the comparison of model and human heterogeneity

Abstract

Models are often evaluated when their behavior is at its closest to a single, sometimes averaged, set of empirical results, but this evaluation neglects the fact that both model and human behavior can be heterogeneous. Here, we develop a measure, g-distance, which considers model adequacy as the extent to which models exhibit a similar range of behaviors to the humans they model. We define g as the combination of two easily interpretable dimensions of model adequacy: accommodation and excess flexibility. We apply this measure to five models of an irrational learning effect, the inverse base-rate effect (IBRE). g-distance identifies two models, a neural network with rapid attentional shifts (NNRAS) and a dissimilarity-similarity generalized context model (DGCM18), that outperform the previously most supported model (EXIT). We show that this conclusion holds for a wide range of beliefs about the relative importance of excess flexibility and accommodation. We further show that a pre-existing metric, the Bayesian Information Criterion (BIC), misidentifies a known-poor model of the IBRE as the most adequate model. Along the way, we discover that some of the models accommodate human behavior in ways that seem unintuitive from an informal understanding of their operation, thus underlining the importance of formal expression of theories. We discuss the implications of our findings for model evaluation generally, and for models of the inverse base-rate effect in particular, and end by suggesting future avenues of research in computational modeling.
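
The abstract does not give the formula for g, so the following is only an illustrative sketch of how two such dimensions might be combined, not the authors' definition. It assumes, hypothetically, that behavior is summarized as a set of discrete response patterns, that accommodation is the share of human patterns a model can produce, that excess flexibility is the share of model patterns no human shows, and that g is a weighted Euclidean distance from the ideal point (accommodation 1, excess flexibility 0). All function names and the `weight` parameter are assumptions made for illustration.

```python
import math


def accommodation(human_patterns, model_patterns):
    """Share of observed human patterns that the model can also produce (assumed definition)."""
    human, model = set(human_patterns), set(model_patterns)
    if not human:
        raise ValueError("at least one human pattern is required")
    return len(human & model) / len(human)


def excess_flexibility(human_patterns, model_patterns):
    """Share of model-producible patterns that no human exhibits (assumed definition)."""
    human, model = set(human_patterns), set(model_patterns)
    if not model:
        return 0.0  # a model that predicts nothing has no excess predictions
    return len(model - human) / len(model)


def g_distance(acc, excess, weight=0.5):
    """Weighted distance from the ideal point (acc=1, excess=0); lower is better.

    `weight` in [0, 1] is a hypothetical knob for the relative importance of
    accommodation versus excess flexibility.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return math.sqrt(weight * (1.0 - acc) ** 2 + (1.0 - weight) * excess ** 2)


# Toy data: pattern labels are placeholders, not results from the article.
human = {"A", "B", "C", "D"}
model = {"A", "B", "E"}
acc = accommodation(human, model)
exc = excess_flexibility(human, model)
g = g_distance(acc, exc, weight=0.5)
```

Sweeping `weight` from 0 to 1 is one way to check whether a ranking of models depends on how the two dimensions are traded off, which mirrors the robustness claim in the abstract.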
