Large Language Model Agents: A Comprehensive Survey on Architectures, Capabilities, and Applications

Abstract

Large Language Model (LLM) agents represent a paradigm shift in artificial intelligence, combining the remarkable reasoning capabilities of foundation models with the ability to perceive environments, make decisions, and take actions autonomously. This comprehensive survey provides an in-depth examination of LLM-based agents across multiple dimensions. We first establish a formal definition of LLM agents and trace their evolution from early language models to today's sophisticated autonomous systems. We then present a novel taxonomy that organizes the field into four fundamental categories: reasoning-enhanced agents that leverage chain-of-thought and tree-structured deliberation; tool-augmented agents that extend LLM capabilities through external APIs and knowledge bases; multi-agent systems that enable collaborative problem-solving through inter-agent communication; and memory-augmented agents that maintain persistent context across interactions. For each category, we analyze representative architectures, discuss key innovations, and evaluate their relative strengths and limitations. We further examine diverse applications spanning software engineering, scientific research, embodied robotics, and web automation, supported by systematic comparisons on established benchmarks including SWE-bench, WebArena, and AgentBench. Our analysis reveals that while current agents achieve impressive performance on structured tasks, significant challenges remain in areas such as long-horizon planning, hallucination mitigation, and safe deployment. We conclude by identifying promising research directions, including neuro-symbolic integration, multi-modal perception, and human-agent collaboration frameworks, providing a roadmap for advancing this rapidly evolving field.