Prompt Injection Attacks in Large Language Models and AI Agent Systems: A Comprehensive Review of Vulnerabilities, Attack Vectors, and Defense Mechanisms
Abstract
Large Language Models (LLMs) have rapidly transformed artificial intelligence applications across industries, yet their integration into production systems has unveiled critical security vulnerabilities, chief among them prompt injection attacks. This comprehensive review synthesizes research from 2023-2025, analyzing over 120 peer-reviewed papers, industry security reports, and documented real-world exploits. We examine the taxonomy of prompt injection techniques, including direct jailbreaking and indirect injection through external content. The rise of AI agent systems and the Model Context Protocol (MCP) has dramatically expanded attack surfaces, introducing vulnerabilities such as tool poisoning and credential theft. We document critical incidents including GitHub Copilot's CVE-2025-53773 remote code execution vulnerability (CVSS 9.6) and ChatGPT's Windows license key exposure. Research demonstrates that just five carefully crafted documents can manipulate AI responses 90% of the time through Retrieval-Augmented Generation (RAG) poisoning. This review provides actionable mitigation strategies based on the OWASP Top 10 for LLM Applications 2025, identifies fundamental limitations including the stochastic nature problem and the alignment paradox, and proposes research directions for architecturally secure AI systems. Our analysis reveals that prompt injection represents a fundamental architectural vulnerability requiring defense-in-depth approaches rather than singular solutions.