Toward Responsible AI in High-Stakes Domains: A Dataset for Building Static Analysis with LLMs in Structural Engineering

Carlos Fabian Avila Vega
Daniel Jefferson Ibay Yupa
Paola Vanessa Tapia Tapia
Edgar David Rivera Tapia

Abstract

Modern engineering faces an unprecedented paradox: while our systems grow increasingly complex, the tools we use to design and evaluate them must remain both reliable and transparent. Decisions in energy, infrastructure, and construction no longer occur in isolation but within socio-technical networks shaped by emerging technologies and artificial intelligence (AI). Among these advances, large language models (LLMs) such as GPT have attracted attention for their ability to synthesize solutions, interpret domain-specific queries, and generate outputs with minimal fine-tuning. Yet beneath this promise lies a critical flaw: LLMs do not compute; they predict. Their reliance on statistical associations often leads to biases, logical missteps, or hallucinated values, shortcomings that become especially problematic when applied to structural engineering, where safety and compliance are non-negotiable. This tension sets the stage for the present work. The dataset introduced here responds to this gap by demonstrating how generative AI can be grounded within validated computational workflows. Through the Model Context Protocol (MCP), ChatGPT was connected to numerical solvers such as OpenSees and benchmarked against ETABS, ensuring traceability, reproducibility, and compliance with seismic design standards. The dataset comprises technical prompts, GPT outputs, verified numerical analyses, and comparative error metrics for four reinforced concrete frame models designed under Ecuadorian (NEC-15) and U.S. (ASCE 7-22) standards. Beyond a simple record, it exemplifies a reproducible methodology for embedding LLMs within structural engineering practice. By curating and releasing this dataset, the study pursues three goals: to strengthen reproducibility by enabling independent verification, to foster interdisciplinary collaboration across AI, civil engineering, and data science, and to establish benchmarks for context-aware AI integration in high-stakes domains.
In doing so, it not only illustrates the promise of human–AI teaming but also highlights the limitations that must be addressed if generative models are to be responsibly embedded in engineering decision-making.
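
The abstract mentions comparative error metrics between the LLM-driven analyses and the ETABS benchmark but does not define them. As a minimal sketch, assuming the metric is a signed relative error in percent, the comparison might look like the following. The function name, record layout, quantity names, and all numeric values are hypothetical placeholders chosen for illustration; they are not taken from the dataset.

```python
def relative_error_pct(candidate: float, reference: float) -> float:
    """Signed relative error of `candidate` against `reference`, in percent."""
    if reference == 0:
        raise ValueError("reference value must be nonzero")
    return (candidate - reference) / abs(reference) * 100.0


# Hypothetical values for illustration only; NOT results from the dataset.
# "candidate" stands for a solver-backed LLM result, "reference" for the benchmark.
records = [
    {"quantity": "fundamental_period_s", "candidate": 0.52, "reference": 0.50},
    {"quantity": "base_shear_kN", "candidate": 980.0, "reference": 1000.0},
]

# Map each response quantity to its signed percentage error.
errors = {
    r["quantity"]: relative_error_pct(r["candidate"], r["reference"])
    for r in records
}

for name, err in errors.items():
    print(f"{name}: {err:+.2f}%")
```

A signed error keeps the direction of the deviation visible, so overestimates and underestimates of a response quantity can be told apart; an absolute value could be taken where only the magnitude matters.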

Version published to 10.20944/preprints202509.0212.v1
Sep 2, 2025

