Controlled Agentic AI Systems: A Governance-Driven Architecture for Auditable and Reproducible Decision Pipelines
Abstract
Artificial intelligence systems deployed in safety-critical and regulated environments require guarantees of constraint compliance, auditability, and reproducibility. However, contemporary AI architectures typically treat governance and regulatory constraints as external or post hoc mechanisms, which limits their ability to ensure consistent and reliable execution. This paper introduces Controlled Agentic AI Systems (CAIS), a formal architectural framework in which governance is embedded directly into the decision pipeline as a deterministic operator. A CAIS is defined as a system integrating a decision model Mθ, a constraint set C, and a governance operator G that maps proposed decisions into an admissible action space. We formalize the decision transformation at = G(Mθ(xt), C, st), introduce audit trace semantics, and define replayability conditions that enable reproducible execution. Theoretical analysis establishes that governance projection guarantees constraint satisfaction while inducing bounded decision drift under perturbations. We implement a reference framework and conduct controlled experiments in a multi-agent simulation environment under reproducible conditions. Results show that governance significantly reduces constraint violations, with approval-based gating achieving near-complete compliance and projection-based repair providing consistent mitigation. Crucially, these improvements are obtained without destabilizing system dynamics and with bounded intervention cost. The experiments further reveal a structured trade-off between safety and decision drift across governance mechanisms. In federated settings, governance does not degrade convergence and instead stabilizes executed actions across training rounds, reducing action variance and eliminating constraint violations on benchmark states. These findings indicate that governance effectively decouples parameter-space variability from behavior-space outcomes.
The proposed CAIS framework establishes governance as a fundamental architectural component of AI systems, providing a unified and experimentally validated approach to designing safe, auditable, and reproducible agentic intelligence.
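The decision transformation at = G(Mθ(xt), C, st) described above can be illustrated with a minimal sketch. The function and variable names below are hypothetical, the constraint set is simplified to an admissible interval, and the two governance modes (projection-based repair and approval-based gating) are reduced to their simplest forms; the paper's reference framework may differ in detail.

```python
# Illustrative sketch of a CAIS-style governed decision step.
# Names (model_decision, governance_operator, step) are hypothetical.

def model_decision(x):
    # Stand-in decision model M_theta: proposes an action from an input.
    return 2.0 * x

def governance_operator(proposed, constraints, state, mode="project"):
    """Map a proposed action into the admissible action space.

    mode="project": projection-based repair (map to nearest admissible value).
    mode="approve": approval-based gating (reject and use a safe default).
    Returns the executed action and metadata for the audit trace.
    """
    lo, hi = constraints  # admissible interval [lo, hi] stands in for C
    if lo <= proposed <= hi:
        return proposed, {"intervened": False}
    if mode == "project":
        repaired = min(max(proposed, lo), hi)
        return repaired, {"intervened": True, "kind": "projection"}
    # Approval-based gating: the proposal is rejected outright.
    return state["safe_default"], {"intervened": True, "kind": "gating"}

def step(x, constraints, state, mode="project"):
    # One governed decision step; the appended record is the audit-trace
    # entry that makes the execution replayable.
    proposed = model_decision(x)
    executed, meta = governance_operator(proposed, constraints, state, mode)
    state["trace"].append(
        {"input": x, "proposed": proposed, "executed": executed, **meta}
    )
    return executed
```

For example, with constraints (0.0, 5.0), an input of 3.0 yields a proposed action of 6.0, which projection repairs to 5.0; the trace records both the proposed and the executed values, so the run can be replayed and audited deterministically.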