PlanCompiler: A Deterministic Compilation Architecture for Structured Multi-Step LLM Pipelines

Pranav Harikumar

Abstract

Large language models (LLMs) are brittle in multi-step structured workflows, where errors compound across sequential transformations, validation stages, and stateful operations such as SQL persistence. We present PlanCompiler, a deterministic compilation architecture for structured LLM pipelines that separates planning from execution through typed node registries, static graph validation, and topological compilation. Rather than relying on autoregressive chaining at runtime, the system executes a prevalidated workflow graph with explicit parameter and schema constraints. We evaluate the approach on 300 tasks across six benchmark sets spanning increasing workflow depth, SQL roundtrip persistence, and schema-drift stress tests. On depth-stratified benchmarks, the compiled system achieves 100% accuracy on Sets A and B, 88% on Set C, and 96% on Set D, outperforming direct code-generation baselines from GPT-4.1 and Claude Sonnet. On schema-trap tasks, it achieves 44/50, compared with 20/50 for GPT-4.1 and 26/50 for Claude. Across the full suite, the compiler uses approximately $0.356 in total inference cost, versus $2.140 for GPT-4.1 and $18.391 for Claude. These results suggest that, in this benchmark setting, deterministic compilation improves first-pass reliability and cost efficiency for structured multi-step LLM pipelines, while shifting remaining failures toward the planner–semantics boundary rather than execution-time structural errors.
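The abstract names three mechanisms: typed node registries, static graph validation, and topological compilation. As a rough illustration of how they could fit together, here is a minimal sketch. It is not the PlanCompiler implementation: the registry contents, the plan format, and the names `NodeType`, `REGISTRY`, `PlanError`, and `compile_plan` are all assumptions made for this example.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter


@dataclass(frozen=True)
class NodeType:
    """Registry entry: declared input types and output type of one operation."""
    inputs: tuple[str, ...]
    output: str


# Hypothetical registry; the op names and type names are illustrative only.
REGISTRY = {
    "load_csv": NodeType(inputs=(), output="table"),
    "filter_rows": NodeType(inputs=("table",), output="table"),
    "write_sql": NodeType(inputs=("table",), output="row_count"),
}


class PlanError(ValueError):
    """Raised when a plan fails static validation, before anything executes."""


def compile_plan(plan: dict[str, dict]) -> list[str]:
    """Validate a plan statically and return node ids in execution order.

    `plan` maps a node id to {"op": registry key, "deps": [upstream node ids]}.
    """
    # Pass 1: every node must refer to a registered operation.
    for node_id, spec in plan.items():
        if spec["op"] not in REGISTRY:
            raise PlanError(f"{node_id}: unknown op {spec['op']!r}")

    # Pass 2: every edge must connect a matching output type to an input type.
    for node_id, spec in plan.items():
        deps = spec.get("deps", [])
        expected = REGISTRY[spec["op"]].inputs
        if len(deps) != len(expected):
            raise PlanError(
                f"{node_id}: expected {len(expected)} inputs, got {len(deps)}"
            )
        for dep, want in zip(deps, expected):
            if dep not in plan:
                raise PlanError(f"{node_id}: missing dependency {dep!r}")
            got = REGISTRY[plan[dep]["op"]].output
            if got != want:
                raise PlanError(f"{node_id}: needs {want!r}, {dep} yields {got!r}")

    # Pass 3: topological ordering; a cycle means the plan is not executable.
    graph = {node_id: set(spec.get("deps", [])) for node_id, spec in plan.items()}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise PlanError("plan contains a cycle") from exc


plan = {
    "load": {"op": "load_csv", "deps": []},
    "clean": {"op": "filter_rows", "deps": ["load"]},
    "save": {"op": "write_sql", "deps": ["clean"]},
}
order = compile_plan(plan)
```

In this sketch a plan that feeds the `row_count` output of `write_sql` into `filter_rows` is rejected during validation, so a type mismatch surfaces before any node runs rather than partway through execution.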

Version published to 10.21203/rs.3.rs-9186427/v1 on Research Square
Mar 24, 2026

Related articles

Grammar-Guided Incremental Method for Efficient LLM-Generated Code Execution

This article has 2 authors:
1. Anton Svystunov
2. Yaroslav Tereshchenko
This article has no evaluations. Latest version: Apr 2, 2026

ConsultChain: Progressive Context Distillation Across Heterogeneous LLM Fleets for Token-Optimal Inference

This article has 1 author:
1. Samuel Edusa
This article has no evaluations. Latest version: Apr 13, 2026

Merging LoRA Adapters for Multi-Task Code Analysis: An Empirical Study of Linear Combination and Task Interference

This article has 2 authors:
1. Sankalp Pathak
2. Sanjay Garg
This article has no evaluations. Latest version: Apr 16, 2026