Improving ToM Capabilities of LLMs in Applied Domains

Abstract

Large language models (LLMs) have demonstrated impressive capabilities across a variety of tasks, but they often fall short in areas requiring nuanced theory of mind (ToM) reasoning. These limitations, including weak state tracking and an inability to infer the mental states of others, hinder their application in domains that demand robust cognitive reasoning. In this work, we address these challenges through three key contributions. First, we propose a novel formalism for modeling mental worlds that mitigates the reasoning gaps observed in LLMs and provides a structured framework for more robust inference. Second, we algorithmically enhance LLM performance in a specific ToM subdomain by combining fine-tuning with targeted simulations, enabling models to overcome domain-specific reasoning limitations. Finally, we develop a comprehensive methodology for generating complex, high-quality training data tailored to improving ToM reasoning in LLMs. This approach addresses the scarcity of suitable datasets by synthesizing scenarios that explicitly require ToM capabilities, facilitating more effective model training. Together, these contributions represent a significant step toward equipping LLMs with stronger cognitive reasoning abilities and advancing their applicability to socially and cognitively complex tasks.
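
The abstract does not spell out the formalism or the data-generation pipeline, so the following is only a minimal illustrative sketch: a toy representation that separates the true world state from each agent's possibly stale beliefs, used here to generate a classic Sally-Anne false-belief record of the kind such a methodology might synthesize. All names (MentalWorld, sally_anne_example, the observe/change/believes operations) are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class MentalWorld:
    """Tracks the true world state and each agent's (possibly stale) beliefs."""
    facts: dict = field(default_factory=dict)    # ground truth, e.g. {"marble": "basket"}
    beliefs: dict = field(default_factory=dict)  # agent -> {fact: believed value}

    def observe(self, agent: str, fact: str) -> None:
        """An agent who observes a fact updates their belief to the true value."""
        self.beliefs.setdefault(agent, {})[fact] = self.facts[fact]

    def change(self, fact: str, value: str, witnesses: list[str]) -> None:
        """Change the world; only the listed witnesses update their beliefs."""
        self.facts[fact] = value
        for agent in witnesses:
            self.observe(agent, fact)

    def believes(self, agent: str, fact: str):
        """What the agent believes, which may diverge from the true facts."""
        return self.beliefs.get(agent, {}).get(fact)


def sally_anne_example() -> dict:
    """Generate one false-belief scenario as a training record."""
    w = MentalWorld(facts={"marble": "basket"})
    w.observe("Sally", "marble")
    w.observe("Anne", "marble")
    # Sally leaves; Anne moves the marble, so only Anne's belief updates.
    w.change("marble", "box", witnesses=["Anne"])
    return {
        "story": ("Sally puts her marble in the basket and leaves. "
                  "Anne moves the marble to the box."),
        "question": "Where will Sally look for her marble?",
        "answer": w.believes("Sally", "marble"),  # "basket", not "box"
    }


if __name__ == "__main__":
    print(sally_anne_example())
```

Looped over randomized agents, objects, locations, and deeper nestings of who witnessed what, a generator along these lines is one plausible way to synthesize the ToM-demanding scenarios the third contribution describes, though the paper's actual method may differ.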
