Accountable Deployment of Agentic AI Demands Layered, System-Level Interpretability
Abstract
Agentic AI systems behave through trajectories: they plan, invoke tools, update memory, and coordinate over multiple steps. However, interpretability remains largely model-centric, focused on explaining single predictions rather than tracing long-horizon behavior and responsibility across interacting components. As a result, critical failures, such as tool misuse, coordination breakdowns, or goal drift, often evade existing audits until harm occurs. We argue that interpretability for agentic systems must become system-centric, addressing trajectories, responsibility assignment, and lifecycle dynamics rather than internal model mechanisms alone. We advance three claims: interpretability must (1) co-evolve with agentic capabilities, (2) address distinct layers of opacity with tailored methods, and (3) integrate across the deployment lifecycle. To operationalize this position, we introduce ATLIS (Agentic Trajectory and Layered Interpretability Stack), a framework integrating five interpretability layers across a five-stage deployment lifecycle. ATLIS enables lightweight continuous monitoring with risk-aware escalation to deeper system-level analysis when incidents are detected. ATLIS provides a blueprint for closing the growing gap between agentic capabilities and the interpretability infrastructure needed to govern them.
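To make the monitoring-with-escalation idea concrete, the sketch below shows the kind of loop the abstract describes: cheap per-step checks run continuously over a trajectory, and any step whose risk crosses a threshold escalates the trajectory prefix to a deeper, system-level analyzer. This is a minimal illustration under assumed names; `StepRecord`, `monitor_trajectory`, `risk_score`, and the threshold are hypothetical, not an API defined by ATLIS.

```python
from dataclasses import dataclass

@dataclass
class StepRecord:
    """One step of an agent trajectory: a plan, tool call, memory update, etc."""
    action: str
    risk_score: float  # from a lightweight continuous monitor, in [0, 1]

def monitor_trajectory(steps, escalate_threshold=0.8, deep_analyzer=None):
    """Run cheap checks on every step; escalate to deeper, system-level
    analysis only when a step's risk crosses the threshold."""
    incidents = []
    for i, step in enumerate(steps):
        if step.risk_score >= escalate_threshold:
            incidents.append(i)
            if deep_analyzer is not None:
                # Deep analysis receives the full trajectory prefix, since
                # failures such as goal drift only surface across steps.
                deep_analyzer(steps[: i + 1])
    return incidents

# Usage: a tool-misuse step trips the escalation path.
trajectory = [
    StepRecord("plan: summarize quarterly report", 0.05),
    StepRecord("tool: shell('rm -rf /tmp/cache')", 0.92),
]
flagged = monitor_trajectory(
    trajectory,
    deep_analyzer=lambda t: print(f"deep analysis over {len(t)} steps"),
)
print("incident steps:", flagged)  # -> [1]
```

The design point this illustrates is the cost asymmetry the abstract relies on: continuous monitoring stays lightweight because deep trajectory-level analysis is invoked only on detected incidents, not on every step.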