BELIEFS: A Hierarchical Theory of Mind Model based on Strategy Inference
Abstract
Theory of Mind (ToM) refers to the capacity to infer others' latent mental states, such as intentions, beliefs, and strategies, and to use these inferences to predict behavior. A defining characteristic of ToM is its recursive nature: individuals reason not only about what others are thinking, but also about what others think about them. Most computational models of ToM adopt a hierarchical structure in which Level-0 (L0) agents are assumed to follow simple, fixed heuristics (e.g., Win-Stay-Lose-Shift, WSLS) without mentalizing. However, this assumption overlooks the diversity of non-mentalizing strategies exhibited in human behavior, such as imitation or tit-for-tat, which do not conform to WSLS yet require no recursive reasoning. To address this limitation, we introduce a novel ToM framework (BELIEFS) that flexibly infers latent L0 strategies from behavior rather than relying on predefined heuristics. We evaluated the model in four classic dyadic games (Matching Pennies, Prisoner's Dilemma, Bach or Stravinsky, and Stag Hunt), manipulating the model's learning rate and the volatility of L0 strategy switching. Predictive accuracy was assessed using the cumulative negative log-likelihood (NLL) of the opponent's next choice and compared against both a ToM model that assumes only WSLS at L0 and chance-level performance. Our model outperformed both baselines, particularly under low-volatility conditions and at intermediate learning rates. Moreover, to evaluate strategy inference, we computed trial-wise confusion matrices and Cohen's κ between inferred and true L0 strategies; classification was significantly above chance. We further tested the model's ability to distinguish between action sequences generated by the opponent's true ToM level (L0 vs. L1) and those generated using an incorrect ToM level.
The model assigned lower NLLs to sequences from the true level, suggesting an indirect method for identifying the opponent's actual ToM level. Finally, we assessed whether the model effectively tracks behaviorally distinguishable action probabilities across ToM levels. Using Fisher-transformed correlations between model-generated action probabilities at L0, L1, and L2, we found significant dissimilarities, especially in competitive games. In summary, our model introduces a flexible, probabilistic approach to ToM that captures both surface-level strategy use and recursive reasoning depth. By jointly tracking dynamic beliefs over L0 strategies and ToM levels, the model adapts to behavioral shifts and outperforms static heuristics. These advances provide a powerful framework for modeling human behavior in interactive contexts, with implications for both human-human and human-machine interaction research.
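
The two evaluation measures named above, cumulative NLL against a chance baseline and Cohen's κ between true and inferred L0 strategies, can be illustrated with a minimal sketch. This is not the BELIEFS implementation: the function names, the binary-action encoding, and the strategy labels in the tests are assumptions chosen for illustration, and only the textbook definitions of the two measures are used.

```python
import math


def cumulative_nll(pred_probs, choices):
    """Sum of -log p(observed choice) over trials.

    pred_probs[t] is the predicted probability that the opponent picks
    action 1 on trial t (binary actions are an assumption of this sketch);
    choices[t] is the action actually observed (0 or 1).
    """
    total = 0.0
    for p, c in zip(pred_probs, choices):
        p_obs = p if c == 1 else 1.0 - p
        # Clip to avoid log(0) when a prediction is fully confident and wrong.
        total += -math.log(max(p_obs, 1e-12))
    return total


def chance_nll(n_trials, n_actions=2):
    """Cumulative NLL of a predictor that assigns uniform probability."""
    return n_trials * math.log(n_actions)


def cohens_kappa(true_labels, inferred_labels):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    true_labels = list(true_labels)
    inferred_labels = list(inferred_labels)
    n = len(true_labels)
    labels = set(true_labels) | set(inferred_labels)
    # Observed agreement: fraction of trials where the labels match.
    p_o = sum(t == i for t, i in zip(true_labels, inferred_labels)) / n
    # Expected agreement if the two labelings were independent.
    p_e = sum(
        (true_labels.count(lab) / n) * (inferred_labels.count(lab) / n)
        for lab in labels
    )
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (p_o - p_e) / (1.0 - p_e)
```

A model predicts better than chance on a sequence when its `cumulative_nll` is below `chance_nll` for the same number of trials. A κ of 0 means strategy inference is no better than chance agreement, and a κ of 1 means perfect recovery of the true strategies.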