Does the Use of Generative AI Undermine Learning? A Randomized Controlled Trial
Abstract
Purpose: This study quantitatively evaluates the impact of using a large language model (LLM) on student learning outcomes.

Design: A 1:1 parallel-group randomized controlled trial.

Setting: Classroom experiments conducted at Komazawa University, Tokyo, Japan.

Intervention: The study comprised two experiments. In both, participants in the intervention group were permitted to use a large language model (Google Bard or Google Gemini) during a learning activity on a specific topic, while those in the control group were prohibited from using LLMs. Fourteen days later, all participants completed tasks on the learned topic without access to LLMs. Experiment 1, focused on "Essay Writing," was conducted in December 2023. Experiment 2, focused on "Reference Creation," was conducted in July 2024.

Participants: Experiment 1 involved 50 undergraduate students; Experiment 2 involved 65 undergraduate students.

Main Outcome Measures: Primary outcomes were scores on tasks completed during the initial learning phase and 14 days post-learning. Secondary outcomes included time spent on task input.

Results:
Essay Writing (Experiment 1): Fourteen days post-learning, on an essay-writing task completed without LLM access, the intervention group's mean score was 6.50 and the control group's mean score was 6.23. The mean difference was -0.27 (95% confidence interval (CI) -1.6 to 1.1), and the effect size was very small (Cohen's d = 0.053). These results do not support the conclusion that LLM use hinders learning in this context.
Reference Creation (Experiment 2): Fourteen days post-learning, on a reference-creation task completed without LLM access, the intervention group's mean score was 5.98 and the control group's mean score was 7.54. The mean difference was 1.56 (95% CI 0.98 to 2.14), and the effect size was large (Cohen's d = -1.301).
These results suggest that LLM use hindered learning in this context.

Conclusions: Experiment 1 indicates that LLM use did not impede learning, whereas Experiment 2 indicates that it did. The difference in results may stem from the absence of strict rules in essay writing (Experiment 1) compared to the presence of such rules in reference creation (Experiment 2). When strict rules increase the burden of task execution, participants permitted to use LLMs (Experiment 2) may have engaged in cognitive offloading. While acknowledging several limitations, this study employs a rigorous randomized controlled trial (RCT) methodology to investigate the impact of generative AI use on learning, specifically examining effects 14 days post-intervention, and thereby offers significant implications for higher education.

Pre-registration: https://osf.io/xgzwd
Project page: https://osf.io/jsa5n/
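For readers who want to see how the reported effect sizes are derived, the following is a minimal sketch of the standard Cohen's d calculation with a pooled standard deviation. The group means are taken from Experiment 1; the standard deviations and per-arm sample sizes are hypothetical placeholders, since the abstract does not report them.

```python
import math

def cohens_d(m1: float, s1: float, n1: int, m2: float, s2: float, n2: int) -> float:
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    # Pooled SD weights each group's variance by its degrees of freedom.
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / s_pooled

# Experiment 1 means (from the abstract: intervention 6.50, control 6.23).
# SD = 5.0 and n = 25 per arm are illustrative assumptions only.
d = cohens_d(6.50, 5.0, 25, 6.23, 5.0, 25)
print(round(d, 3))
```

With equal hypothetical SDs of 5.0, the pooled SD is 5.0 and d works out to roughly 0.054, close in magnitude to the d = 0.053 reported for Experiment 1; the actual value depends on the true sample variances.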