PlanCompiler: A Deterministic Compilation Architecture for Structured Multi-Step LLM Pipelines
Abstract
Large language models (LLMs) are brittle in multi-step structured workflows, where errors compound across sequential transformations, validation stages, and stateful operations such as SQL persistence. We present PlanCompiler, a deterministic compilation architecture for structured LLM pipelines that separates planning from execution through typed node registries, static graph validation, and topological compilation. Rather than relying on autoregressive chaining at runtime, the system executes a prevalidated workflow graph with explicit parameter and schema constraints. We evaluate the approach on 300 tasks across six benchmark sets spanning increasing workflow depth, SQL roundtrip persistence, and schema-drift stress tests. On depth-stratified benchmarks, the compiled system achieves 100% accuracy on Sets A and B, 88% on Set C, and 96% on Set D, outperforming direct code-generation baselines from GPT-4.1 and Claude Sonnet. On schema-trap tasks, it achieves 44/50, compared with 20/50 for GPT-4.1 and 26/50 for Claude Sonnet. Across the full suite, the compiler incurs approximately $0.356 in total inference cost, versus $2.140 for GPT-4.1 and $18.391 for Claude Sonnet. These results suggest that, in this benchmark setting, deterministic compilation improves first-pass reliability and cost efficiency for structured multi-step LLM pipelines, while shifting the remaining failures toward the planner–semantics boundary rather than execution-time structural errors.
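To make the three mechanisms named in the abstract concrete (a typed node registry, static graph validation, and topological compilation), the following is a minimal Python sketch. It is not PlanCompiler's actual implementation: the registry contents, the operation names (`const_rows`, `sort`, `total`), the plan format, and the helpers `compile_plan` and `run` are all hypothetical, chosen only to show how a plan can be rejected before any node executes.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter
from typing import Callable


@dataclass(frozen=True)
class NodeSpec:
    """Registry entry: a function plus its declared input and output types."""
    fn: Callable
    in_types: tuple
    out_type: type


# Hypothetical registry; the operation names are illustrative assumptions.
REGISTRY = {
    "const_rows": NodeSpec(lambda: [3, 1, 2], (), list),
    "sort": NodeSpec(lambda xs: sorted(xs), (list,), list),
    "total": NodeSpec(lambda xs: sum(xs), (list,), int),
}


def compile_plan(plan):
    """Validate a plan statically and return node ids in execution order.

    `plan` maps node id -> (operation name, list of upstream node ids).
    Nothing is executed here; every check uses declared types only.
    """
    # Pass 1: every operation must exist in the registry.
    for node_id, (op, _) in plan.items():
        if op not in REGISTRY:
            raise ValueError(f"{node_id}: unknown operation {op!r}")

    # Pass 2: arity and declared types must line up along every edge.
    for node_id, (op, deps) in plan.items():
        spec = REGISTRY[op]
        if len(deps) != len(spec.in_types):
            raise ValueError(f"{node_id}: expected {len(spec.in_types)} inputs")
        for dep, expected in zip(deps, spec.in_types):
            if dep not in plan:
                raise ValueError(f"{node_id}: missing upstream node {dep!r}")
            produced = REGISTRY[plan[dep][0]].out_type
            if not issubclass(produced, expected):
                raise TypeError(
                    f"{node_id}: {dep!r} produces {produced.__name__}, "
                    f"expected {expected.__name__}"
                )

    # Pass 3: topological ordering; a cycle is a compile-time error.
    graph = {node_id: deps for node_id, (_, deps) in plan.items()}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("plan contains a cycle") from exc


def run(plan):
    """Execute a validated plan in the compiled order."""
    results = {}
    for node_id in compile_plan(plan):
        op, deps = plan[node_id]
        results[node_id] = REGISTRY[op].fn(*(results[d] for d in deps))
    return results


# A three-node plan: produce rows, sort them, then sum them.
plan = {
    "src": ("const_rows", []),
    "sorted": ("sort", ["src"]),
    "sum": ("total", ["sorted"]),
}
results = run(plan)
```

In this sketch, a plan that feeds the integer output of `total` into `sort` fails in `compile_plan` with a `TypeError` before any node runs. This is the sense in which structural errors are moved from execution time to compile time, leaving only errors in what the planner chose to build.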