Condensed Reasoning Prompting: Efficient Strategies, Evaluations, and Trade-Offs in Large Language Model Reasoning

Abstract

Recent advancements in large language models (LLMs) have demonstrated that explicitly prompting for intermediate reasoning steps significantly improves performance on complex tasks. Traditional chain of thought (CoT) prompting, however, can produce verbose outputs that increase both latency and computational cost. Condensed Reasoning Prompting (CRP) addresses this trade-off by encouraging more concise reasoning traces while maintaining high accuracy. In this paper, we systematically evaluate three prompting strategies: chain of thought (CoT), chain of draft (CoD), and condensed reasoning (CRP), across multiple datasets, including MMLU, BIG-Bench, arithmetic, and symbolic reasoning tasks. We report accuracy, average tokens per question, and a token effectiveness metric (accuracy divided by token count). Our experiments are conducted in a zero-shot setting, without specific system instructions to "skip reasoning," providing a more realistic assessment of model capabilities. Results indicate that condensed prompts often match or exceed chain of thought accuracy while reducing token usage, thus offering significant gains in efficiency. We discuss the implications for real-world deployments, highlighting how CRP can enable more efficient LLM applications without compromising performance.
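As a minimal sketch of the token effectiveness metric defined above (accuracy divided by average tokens per question), the following uses made-up illustrative numbers, not results from the paper:

```python
def token_effectiveness(accuracy: float, avg_tokens: float) -> float:
    """Token effectiveness: accuracy divided by average tokens per question.

    A higher value means the prompting strategy buys more accuracy
    per token of model output.
    """
    if avg_tokens <= 0:
        raise ValueError("avg_tokens must be positive")
    return accuracy / avg_tokens

# Hypothetical numbers for illustration only (not the paper's results):
# a verbose CoT run vs. a condensed run with similar accuracy.
cot_eff = token_effectiveness(accuracy=0.80, avg_tokens=200)
crp_eff = token_effectiveness(accuracy=0.79, avg_tokens=60)

print(f"CoT token effectiveness: {cot_eff:.4f}")
print(f"CRP token effectiveness: {crp_eff:.4f}")
```

Under these assumed numbers, the condensed strategy scores higher on the metric despite slightly lower accuracy, which is the trade-off the abstract describes.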
