Architectural Transparency in LLM-Based Cognitive Assessment: A Multidimensional TRACE-ED Evaluation of Single-Agent and Multi-Agent Systems
Abstract
Purpose: The rapid integration of Large Language Models (LLMs) into educational assessment systems has intensified the need for rigorous transparency validation beyond accuracy and score agreement. While prior research primarily evaluates automated grading systems through reliability and human–AI alignment metrics, limited attention has been given to how architectural design influences transparency properties. This study proposes TRACE-ED, a multidimensional transparency framework that operationalizes transparency across five dimensions: Reliability (R), Alignment (A), Claim–Evidence Grounding (C), Explanation Coherence (E), and Disclosure/Auditability (D). Transparency is conceptualized as an architectural vector rather than a scalar attribute.
Methods: Using a Design Science Research methodology and a Monte Carlo experimental design, this study compares single-agent and multi-agent LLM architectures across 10 synthetic short-answer responses, three rubric indicators, multiple temperature settings, and 15 repetitions per condition (total runs = 900). Reliability is evaluated using the Intraclass Correlation Coefficient, ICC(1,k), while grounding and coherence are measured through semantic similarity thresholds and polarity alignment metrics. Statistical comparisons employ Welch's t-test and Cohen's d effect sizes.
Results: Results demonstrate that architectural modularization preserves high reliability (Multi-Agent ICC(1,k) = 0.9921) while significantly increasing semantic grounding (GR = 0.763; d = 3.72) and explanation coherence (CS = 0.782; d = 7.90). A minor increase in contradiction rate (CR = 0.031; d = 0.43) indicates limited coordination trade-offs. These findings confirm that transparency is multidimensional and architecture-sensitive: multi-agent systems redistribute transparency components rather than uniformly maximizing them.
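The statistics named in Methods can be sketched in a few lines of NumPy. The following is a minimal illustration, not the authors' implementation: ICC(1,k) is computed from a one-way ANOVA decomposition over a targets-by-repetitions score matrix, and Welch's t and pooled-SD Cohen's d compare two metric samples. The simulated score matrix at the bottom is entirely hypothetical.

```python
import numpy as np

def icc_1k(scores):
    """ICC(1,k), one-way random effects, average measures.

    scores: (n_targets, k) matrix of k repeated scores per target.
    """
    n, k = scores.shape
    row_means = scores.mean(axis=1)
    # Between-target and within-target mean squares from one-way ANOVA
    msb = k * ((row_means - scores.mean()) ** 2).sum() / (n - 1)
    msw = ((scores - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / msb

def welch_t(a, b):
    """Welch's t statistic (unequal variances)."""
    return (a.mean() - b.mean()) / np.sqrt(
        a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))

def cohens_d(a, b):
    """Cohen's d with pooled standard deviation."""
    na, nb = len(a), len(b)
    sp = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                 / (na + nb - 2))
    return (a.mean() - b.mean()) / sp

# Hypothetical data: 10 responses, each re-scored over 15 Monte Carlo repetitions
rng = np.random.default_rng(0)
scores = rng.normal(3.0, 1.0, size=(10, 1)) + rng.normal(0, 0.05, size=(10, 15))
print(round(icc_1k(scores), 4))  # high ICC: repetitions agree closely per response
```

High ICC(1,k) values, as reported above, arise when between-response variance dominates within-response (repetition) variance.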
Conclusion: This study advances explainable AI research in education by formalizing transparency as a measurable architectural vector, introducing Monte Carlo–based stochastic validation for LLM assessment systems, and providing empirical evidence of transparency trade-offs in modular AI design. The TRACE-ED framework offers a scalable and auditable methodology for evaluating AI-driven grading systems in high-stakes educational contexts.