AI Tutors in Higher Education: Comparing Expectations to Evidence
Abstract
Given the rapid advances, and notable failures, in large language model generative AI (genAI), there are elevated expectations that retrieval-augmented generation (RAG) AI tutors will revolutionize higher education by offering individualized, always-available tutoring based on validated content. However, experimental evidence on their effectiveness remains scarce. Using a randomized controlled field experiment, this study examines the effects of a genAI tutor on key precursors to learning success (i.e., interest, self-efficacy, and engagement) and on academic achievement for about 450 undergraduate students across two modalities (in-person and asynchronous online). We completed a semester-long controlled experiment with pre- and post-treatment surveys and tests. Contrary to expectations, we found that the genAI tutor had no statistically significant impact on any measured outcome. These early results challenge assumptions about AI’s instructional effectiveness and suggest that universities should further investigate the pedagogical value of AI tutors before making substantial investments or committing to long-term contracts. We recommend future research to increase the generalizability of the findings and to discover methods to improve the efficacy of AI tutors in higher education.