iRAT: Replanning and Controlled Retrieval for Robust LLM Reasoning


Abstract

Large Language Models (LLMs) have demonstrated significant capabilities in answering questions using techniques such as Chain of Thought (CoT) and Retrieval-Augmented Generation (RAG). CoT enables step-by-step reasoning to improve accuracy, while RAG supplements LLMs with relevant external information. Retrieval-Augmented Thoughts (RAT) combines CoT and RAG to give reasoning chains a more robust factual foundation and greater coherence. However, RAT is limited in its ability to handle uncertainty and lacks replanning, often resulting in unnecessary retrievals, inefficiencies, and globally inconsistent reasoning. To address these limitations, we introduce iRAT, a novel reasoning framework that enhances RAT with retrieval control and replanning. iRAT dynamically evaluates uncertainty in initial responses, employs controlled and filtered retrievals to obtain only the most relevant context, revises thoughts to align with the new content, and uses replanning to correct earlier thoughts. Evaluations demonstrate that iRAT outperforms RAT on the HumanEval, MBPP, and GSM8K benchmarks while substantially reducing the number of retrievals. The source code is available at github.com/prane-eth/iRAT. The fine-tuned model used for replanning is available at huggingface.co/zeeshan5k/iRATReasoningChainEvaluatorv2.
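The control loop described above (uncertainty check, controlled retrieval, revision, replanning) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the uncertainty heuristic, the `retrieve` and `revise` callables, and the threshold are all hypothetical stand-ins.

```python
def uncertainty(thought: str) -> float:
    """Hypothetical uncertainty score: fraction of hedging words.
    The real iRAT evaluates uncertainty with an LLM-based signal."""
    hedges = {"maybe", "possibly", "unsure", "might"}
    words = thought.lower().split()
    return sum(w in hedges for w in words) / max(len(words), 1)


def irat_refine(thoughts, retrieve, revise, threshold=0.2):
    """Refine a chain of thoughts: retrieve only when a step is
    uncertain (retrieval control), revise that step against the
    retrieved context, and replan earlier steps for consistency."""
    refined, retrievals = [], 0
    for thought in thoughts:
        if uncertainty(thought) > threshold:   # controlled retrieval
            context = retrieve(thought)        # filtered external context
            retrievals += 1
            thought = revise(thought, context)
            # Replanning: bring earlier thoughts in line with new context.
            refined = [revise(prev, context) for prev in refined]
        refined.append(thought)
    return refined, retrievals


# Toy usage with stubbed retrieval and revision:
chain, n_retrievals = irat_refine(
    ["Paris is the capital of France.",
     "The answer might possibly be 42, unsure though."],
    retrieve=lambda t: "retrieved facts",
    revise=lambda t, ctx: t + " [revised]",
)
# Only the hedged second step exceeds the threshold, so a single
# retrieval occurs, and replanning also revises the first step.
```

The point of the sketch is the gating: retrieval happens only when the uncertainty score crosses the threshold, which is how iRAT avoids the unnecessary retrievals that plain RAT performs at every step.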
