Transparency in Agentic AI: A Survey of Interpretability, Explainability, and Governance
Abstract
Agentic AI systems built on large language models plan, use tools, and maintain memory over multiple steps; their risks and responsibilities therefore depend on an execution trajectory rather than a single output. Despite progress, work on transparency for such systems remains scattered. Most explainability and interpretability research still targets static or single-step model outputs, while Agentic AI surveys emphasize planning, tools, and memory, giving limited attention to transparency and oversight. The literature insufficiently addresses what should be made transparent and recorded during an agent's lifecycle, and how such records can be verified. Addressing this need, this survey offers a transparency-focused analysis that connects interpretability, explainability, and governance for Agentic AI systems from design to deployment. It synthesizes methods for agent artifacts, including plans, tool interactions, memory events, and coordination signals, and relates them to assurance needs such as faithfulness, auditability, compliance, robustness, and equity. The paper consolidates evaluation practices and highlights gaps, especially in trajectory-level accountability, tool-mediated provenance, and multi-agent coordination transparency. It proposes the Minimal Explanation Packet, a standardized outcome artifact that bundles key lifecycle evidence into an audit-ready record. The survey serves as a reference for researchers and practitioners to consistently compare approaches, design evaluations, and report transparency evidence.
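To make the idea concrete, a Minimal Explanation Packet could be realized as a small structured record that bundles the artifact types the abstract names (plans, tool interactions, memory events, coordination signals) together with a content hash for audit verification. The field names, structure, and hashing scheme below are illustrative assumptions, not a specification from the survey itself; this is a minimal sketch of one possible encoding.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import Any

@dataclass
class MinimalExplanationPacket:
    """Illustrative sketch (not the survey's specification) of an
    audit-ready record bundling trajectory-level lifecycle evidence."""
    plan: list[str]                              # ordered planning steps
    tool_interactions: list[dict[str, Any]]      # tool name, input, output per call
    memory_events: list[dict[str, Any]]          # reads/writes to agent memory
    coordination_signals: list[dict[str, Any]]   # messages exchanged with other agents

    def to_record(self) -> dict[str, Any]:
        """Serialize the packet and attach a SHA-256 content hash so an
        auditor can later verify the evidence has not been altered."""
        body = asdict(self)
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        return {"packet": body, "sha256": digest}

# Hypothetical trajectory: one tool call and one memory write
mep = MinimalExplanationPacket(
    plan=["look up flight price", "report result"],
    tool_interactions=[{"tool": "web_search",
                        "input": "NYC-LHR fare",
                        "output": "$612"}],
    memory_events=[{"op": "write", "key": "last_fare", "value": "$612"}],
    coordination_signals=[],
)
record = mep.to_record()
```

Keying the hash to a canonical (sorted-keys) JSON serialization is one simple way to support the verification need the abstract raises: any post-hoc edit to the recorded trajectory changes the digest.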