BuildArena: A Physics‑Aligned Interactive Benchmark of LLMs for Engineering Construction

Abstract

Engineering construction automation aims to transform natural language specifications into physically viable structures, requiring complex integrated reasoning under strict physical constraints. While modern LLMs possess broad knowledge and strong reasoning capabilities that make them promising candidates for this domain, their construction competencies remain largely unevaluated. To address this gap, we introduce BuildArena, the first physics-aligned interactive benchmark designed for language-driven engineering construction. It contributes to the community in four aspects: (1) a highly customizable benchmarking framework for in-depth comparison and analysis of LLMs; (2) an extendable task design strategy spanning static and dynamic mechanics across multiple difficulty tiers; (3) a 3D Spatial Geometric Computation Library for supporting construction based on language instructions; (4) a baseline LLM agentic workflow that effectively evaluates diverse model capabilities. On eight frontier LLMs, BuildArena comprehensively evaluates their capabilities for language-driven and physics-grounded construction automation. The project page is at build-arena.github.io.
