Dynamic Adaptive Reasoning: Optimizing LLM Inference-Time Thinking via Intent-Aware Scheduling
Abstract
Large language models driven by Chain-of-Thought (CoT) prompting have shown strong potential on complex reasoning tasks, yet their inference-time reasoning often suffers from inefficiency and limited adaptability. This work introduces the Intent-Aware Reasoning Scheduler (IARS), a framework that dynamically refines inference-time reasoning by enabling models to perceive and adjust their own reasoning intent. IARS integrates an independent Intent-Aware Scheduler (IAS) that continuously analyzes generated thought tokens and identifies reasoning states such as exploring, confirming, ambiguous, or near-answer. Based on these states, the IAS issues adaptive directives that modulate reasoning depth and style. The approach requires no retraining and operates purely at inference time through lightweight decoding and prompt adjustments. Experiments on diverse reasoning benchmarks show that IARS achieves higher reasoning quality with fewer tokens, while human evaluation and ablation studies confirm its interpretability and efficiency. These results demonstrate that intent-aware scheduling provides a more adaptive and effective mechanism for steering model reasoning.
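To make the scheduling loop concrete, the sketch below illustrates one possible reading of the abstract: a scheduler that classifies each generated reasoning segment into one of the four states named above and injects a directive before the next segment is generated. This is a minimal, hypothetical illustration, not the paper's implementation; the keyword heuristics, directive wording, and the `generate_step` stub are assumptions introduced here for clarity.

```python
from typing import Callable, List, Tuple

# Reasoning states named in the abstract.
STATES = ("exploring", "confirming", "ambiguous", "near-answer")

def classify_intent(thought: str) -> str:
    """Heuristic stand-in for the IAS intent classifier (assumed cues, not the paper's)."""
    t = thought.lower()
    if any(cue in t for cue in ("therefore", "final answer", "so the answer")):
        return "near-answer"
    if any(cue in t for cue in ("check", "verify", "confirm")):
        return "confirming"
    if any(cue in t for cue in ("not sure", "unclear", "maybe", "alternatively")):
        return "ambiguous"
    return "exploring"

# Example directives that modulate reasoning depth and style (illustrative wording).
DIRECTIVES = {
    "exploring": "Continue reasoning, but keep each step brief.",
    "confirming": "Verify the key step once, then move on.",
    "ambiguous": "State the ambiguity explicitly and commit to the most plausible branch.",
    "near-answer": "Stop elaborating and state the final answer now.",
}

def scheduled_reasoning(
    question: str,
    generate_step: Callable[[str], str],
    max_steps: int = 8,
) -> List[Tuple[str, str]]:
    """Alternate between model generation and scheduler directives, purely at inference time."""
    prompt, trace = question, []
    for _ in range(max_steps):
        thought = generate_step(prompt)        # one reasoning segment from the model
        state = classify_intent(thought)       # scheduler reads the thought tokens
        trace.append((state, thought))
        # Inject the adaptive directive into the prompt for the next segment.
        prompt = f"{prompt}\n{thought}\n[Scheduler: {DIRECTIVES[state]}]"
        if state == "near-answer":
            break
    return trace
```

In this reading, the scheduler never modifies model weights: it only observes generated text and edits the prompt between segments, which is consistent with the abstract's claim that IARS operates through lightweight decoding and prompt adjustments alone.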