PRIME: Prompt Refinement via Information-driven Methods and Expansion: A Modular Framework for Context-Aware Prompt Amplification
Abstract
While Large Language Models (LLMs) have transformed natural language processing, their effectiveness depends critically on prompt quality. Current Retrieval-Augmented Generation (RAG) systems retrieve documents to generate answers; we propose a fundamentally different approach: using retrieval to construct better questions. This paper introduces PRIME (Prompt Refinement via Information-driven Methods and Expansion), a framework that treats prompt construction as a first-class optimization problem. PRIME implements a modular pipeline with heterogeneous document loaders (10+ formats), pluggable embedding strategies (sparse TF-IDF/BM25 and dense Sentence-BERT/Gemini-Embedding-004/OpenAI), persistent vector stores, and multi-provider LLM generators (GPT-4o, Claude-3, Gemini-2.5-Flash, Gemini-3-Flash-Preview). We formalize prompt amplification mathematically and introduce four novel evaluation metrics: structural coherence (S), semantic specificity (P), contextual completeness (C), and lexical readability (L). Comprehensive experiments across 4 domains, 12 embedding configurations, 6 LLM backends, and 30 human-evaluated prompts reveal that: (1) dense embeddings achieve 37-73% higher retrieval precision; (2) Gemini-3-Flash-Preview achieves a 165x expansion ratio with a 0.798 quality score; (3) complex queries outperform simple ones by 92%; (4) human evaluators rate PRIME outputs 4.2/5 with 87% inter-rater agreement; and (5) caching provides a 1,944x speedup. We present a systematic failure analysis identifying when amplification degrades performance. PRIME is released as an open-source library (pip install prompt-amplifier).
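To make the "retrieval to construct better questions" idea concrete, the following is a minimal, hypothetical sketch of a retrieve-then-amplify loop: a terse query is matched against a small corpus with sparse TF-IDF scoring (one of the embedding strategies the abstract names), and the top passages are woven into a richer prompt. This is not PRIME's actual API; the function names (`amplify`, `tfidf_vectors`) and the prompt template are illustrative assumptions.

```python
# Hypothetical sketch of retrieval-driven prompt amplification.
# Sparse TF-IDF retrieval is used here; PRIME also supports dense embeddings.
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute TF-IDF weight dicts for a list of tokenised documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: tf[t] / len(doc) * idf[t] for t in tf})
    return vecs, idf

def cosine(u, v):
    """Cosine similarity between two sparse weight dicts."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def amplify(query, corpus, k=2):
    """Retrieve the k most relevant passages and weave them into the prompt."""
    docs = [d.lower().split() for d in corpus]
    vecs, idf = tfidf_vectors(docs)
    q_tokens = query.lower().split()
    q_tf = Counter(q_tokens)
    q_vec = {t: q_tf[t] / len(q_tokens) * idf.get(t, 0.0) for t in q_tf}
    ranked = sorted(range(len(corpus)),
                    key=lambda i: cosine(q_vec, vecs[i]), reverse=True)
    context = "\n".join(f"- {corpus[i]}" for i in ranked[:k])
    return f"Context:\n{context}\n\nTask: {query}\nAnswer using the context above."

corpus = [
    "Dense embeddings map text to continuous vectors via neural encoders.",
    "BM25 is a sparse bag-of-words ranking function used in search engines.",
    "Prompt quality strongly affects large language model output.",
]
print(amplify("how do dense embeddings work", corpus, k=1))
```

The amplified prompt carries retrieved context alongside the original task, which is the expansion the abstract's "expansion ratio" metric would measure (output prompt length relative to input query length).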