Benefit or Bottleneck? Assessing the Impact of Structured Reflection on Learning from AI-Driven Explanatory Feedback

Abstract

As AI tutors become increasingly capable of delivering rich, personalized feedback at scale, a key challenge remains: novice learners often struggle to process detailed explanations on their own. Structured reflection, grounded in decades of self-explanation research, is a theoretically compelling solution. By helping learners parse feedback and prompting them to actively interpret it, reflection activities are designed to reduce cognitive overload and deepen understanding. But does adding reflection to already rich AI-generated feedback actually help, or does it simply add friction? We tested this in a randomized experiment comparing Python practice with AI-generated, personalized feedback to "reflective practice," which paired identical feedback with structured self-explanation prompts. Contrary to our predictions, reflection did not improve performance on any measure. Instead, it proved to be a temporal bottleneck: it doubled time spent on feedback and reduced practice volume by 40%, without making each learning opportunity more effective. Learners who cycled through more practice-and-feedback iterations outperformed reflective learners at the end of the session and maintained a small, nonsignificant advantage on transfer. Notably, reflection did not provide the scaffolding benefit we predicted for novices; when individual differences did emerge, they instead favored higher-volume practice for more knowledgeable learners. Both practice conditions also substantially outperformed a high-quality video baseline (d = 0.66–0.93), replicating the benefits of active practice with AI feedback over passive instruction. These findings suggest that when AI feedback is already elaborated and personalized, self-explanation activities may be a redundant time sink. As AI-generated feedback reaches learners at scale, the results underscore the necessity of empirically validating pedagogical scaffolds, even those with strong theoretical support, before deploying them broadly.