Dynamic Assessment with AI (Agentic RAG) and Iterative Feedback: A Model for the Digital Transformation of Higher Education in the Global EdTech Ecosystem

Read the full article

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

This article formalizes AI-assisted assessment as a discrete-time policy-level design for iterative feedback and evaluates it in a digitally transformed higher-education setting. We integrate an agentic retrieval-augmented generation (RAG) feedback engine—operationalized through planning (rubric-aligned task decomposition), tool use beyond retrieval (tests, static/dynamic analyzers, rubric checker), and self-critique (checklist-based verification)—into a six-iteration dynamic evaluation cycle. Learning trajectories are modeled with three complementary formulations: (i) an interpretable update rule with explicit parameters η and λ that links next-step gains to feedback quality and the gap-to-target and yields iteration-complexity and stability conditions; (ii) a logistic-convergence model capturing diminishing returns near ceiling; and (iii) a relative-gain regression quantifying the marginal effect of feedback quality on the fraction of the gap closed per iteration. In a Concurrent Programming course (n=35), the cohort mean increased from 58.4 to 91.2 (0–100), while dispersion decreased from 9.7 to 5.8 across six iterations; a Greenhouse–Geisser corrected repeated-measures ANOVA indicated significant within-student change. Parameter estimates show that higher-quality, evidence-grounded feedback is associated with larger next-step gains and faster convergence. Beyond performance, we engage the broader pedagogical question of what to value and how to assess in AI-rich settings: we elevate process and provenance—planning artifacts, tool-usage traces, test outcomes, and evidence citations—to first-class assessment signals, and outline defensible formats (trace-based walkthroughs and oral/code defenses) that our controller can instrument. We position this as a design model for feedback policy, complementary to state-estimation approaches such as knowledge tracing. We discuss implications for instrumentation, equity-aware metrics, reproducibility, and epistemically aligned rubrics. Limitations include the observational, single-course design; future work should test causal variants (e.g., stepped-wedge trials) and cross-domain generalization.

Article activity feed