Harness Engineering for Language Agents: The Harness Layer as Control, Agency, and Runtime
Abstract
Language agents that act through tools, files, browsers, APIs, and persistent sessions are shaped by more than the base model or a single prompt. Their reliability depends on a harness layer that determines which instructions remain authoritative, what actions are available, how state is carried forward, and how failures are handled over time. This position paper argues that recent practice has made this layer visible enough to warrant explicit treatment in NLP. We propose and operationalize a working decomposition of the harness layer into control, agency, and runtime (CAR); situate harness engineering in the arc from software engineering through prompt and context engineering; and provide a lightweight audit of 40 harness-relevant works in our selected evidence base, suggesting a visibility gap between academic papers and public engineering notes. We further argue that many reported agent gains may be partly harness-sensitive rather than purely model-driven, and propose HarnessCard as a lightweight reporting artifact, including a filled example. Grounded in papers, benchmarks, protocols, and engineering notes through March 20, 2026, we argue that reports of progress in language agents should specify not only the model, but also the harness layer that turns capability into governed action.