MetaThink: Empowering Large Reasoning Models with Adaptive Self-Correction at Inference Time


Abstract

Large Reasoning Models (LRMs) face a fundamental challenge in balancing efficient "fast thinking" with accurate "slow thinking": they often struggle to adaptively trigger deeper reasoning without incurring significant computational overhead. This paper introduces \( \textit{MetaThink (MT)} \), a novel inference-time adaptive refinement framework designed to imbue LRMs with conditional self-correction capabilities without requiring any additional training. \( \textit{MetaThink} \) begins with an initial "fast thinking" phase, followed by a lightweight self-monitoring mechanism that assesses confidence through uncertainty markers. When low confidence or potential errors are detected, a refinement token triggers a targeted "slow thinking" phase guided by domain-specific prompts. This allows the model to introspectively review and correct its reasoning, culminating in a more accurate final answer. Our comprehensive evaluation across diverse and challenging benchmarks—spanning mathematical reasoning, code generation, and scientific problem-solving tasks—demonstrates that \( \textit{MetaThink} \) consistently achieves substantial and robust improvements in Pass@1 accuracy. Crucially, these gains are realized while maintaining competitive or even improved inference efficiency, outperforming existing inference-time baselines. Our findings underscore that \( \textit{MetaThink} \) offers an effective, training-free approach to enhancing the reliability and accuracy of LRMs on complex reasoning tasks by striking a superior balance between performance and efficiency.
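The fast-think/monitor/refine loop described above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the function names (`fast_think`, `detect_uncertainty`, `slow_think`), the marker list, and the refinement prompt wording are all hypothetical placeholders for whatever the paper's actual mechanism uses.

```python
# Hedged sketch of the MetaThink inference loop from the abstract.
# All helper names and the uncertainty-marker heuristic are assumptions;
# the abstract only specifies the three-phase structure.

UNCERTAINTY_MARKERS = ("maybe", "not sure", "i think", "possibly")


def fast_think(model, prompt):
    """Phase 1: a single low-cost generation pass."""
    return model(prompt)


def detect_uncertainty(answer):
    """Lightweight self-monitoring: flag low confidence by
    scanning the draft for surface uncertainty markers."""
    text = answer.lower()
    return any(marker in text for marker in UNCERTAINTY_MARKERS)


def slow_think(model, prompt, draft):
    """Phase 2 (conditional): targeted refinement of the draft,
    guided by a (hypothetical) domain-specific review prompt."""
    refine_prompt = (
        f"{prompt}\n\nDraft answer: {draft}\n"
        "Review the reasoning step by step and correct any errors."
    )
    return model(refine_prompt)


def metathink(model, prompt):
    """Run fast thinking; escalate to slow thinking only when the
    self-monitor detects low confidence in the draft."""
    draft = fast_think(model, prompt)
    if detect_uncertainty(draft):
        return slow_think(model, prompt, draft)
    return draft
```

The key design point the abstract emphasizes is that the expensive refinement pass runs only when the monitor fires, which is why accuracy gains can come with competitive inference cost.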
