Meta-Thinking in LLMs via Multi-Agent Reinforcement Learning: A Survey


Abstract

This survey explores the development of meta-thinking capabilities in Large Language Models (LLMs) from a Multi-Agent Reinforcement Learning (MARL) perspective. Meta-thinking refers to the self-reflection, self-assessment, and regulation of internal reasoning processes, and represents a crucial step toward improving LLM reliability, adaptability, and performance, particularly in complex or high-stakes settings. The survey begins by examining current limitations of LLMs, including hallucinations and the absence of robust internal self-evaluation mechanisms. It then reviews contemporary approaches such as Reinforcement Learning from Human Feedback (RLHF), self-distillation, and Chain-of-Thought (CoT) prompting, highlighting both their contributions and limitations. The core focus of the survey is on multi-agent architectures, such as supervisor-agent hierarchies, debate-based systems, and theory-of-mind frameworks, that emulate human-like introspection and enhance robustness. By analyzing reward design, self-play dynamics, and continual learning strategies within MARL, the survey presents a structured roadmap for developing introspective, adaptive, and trustworthy LLM systems. It also discusses evaluation metrics, benchmark datasets, and future research directions, including neuroscience-inspired designs and hybrid symbolic-neural reasoning frameworks.
