AI Chatbots in Medical Education: Exploring Performance, Utility, and Learner Perceptions – A Mixed Methods Study

Abstract

Background: While AI chatbots are gaining popularity in medical education, their pedagogical impact remains under-evaluated. This study examined the impact of a domain-specific chatbot on learning performance, perception, and cognitive engagement among medical students.

Methods: In a randomised crossover design, twenty first-year medical students completed two academic tasks using either a custom-built educational chatbot (Lenny AI by qVault.ai) or conventional study methods. Learning was assessed with Single Best Answer (SBA) questions for each of the two tasks; perceptions were captured with corresponding post-task Likert-scale surveys and separate focus group discussions. Statistical analyses were performed to compare performance scores with perception measures, while qualitative data underwent thematic analysis with independent coding (κ = 0.403–0.633).

Results: Participants rated the chatbot significantly higher than conventional resources in ease of use, satisfaction, engagement, perceived quality, and ease of understanding (p < 0.05). Perceived efficiency and confidence also improved, but with mixed patterns. Lenny AI use did not result in significant performance gains; however, it was positively correlated with perceived efficiency, confidence in applying information, and perceived quality of information. Thematic analysis revealed accelerated factual retrieval but limited critical thinking and schema integration. Students expressed high functional trust but raised concerns about transparency. The chatbot was seen as a tool for rapid fact-checking that favoured goal-directed learners.

Conclusion: AI chatbots can substantially enhance ease of use, satisfaction, and knowledge access in medical education. However, their capacity to foster deep learning remains limited. Future designs must prioritise adaptive scaffolding, traceable sourcing, and support for critical engagement to achieve sustained educational value.
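
The κ range reported for the independent coding is an inter-coder agreement statistic. The abstract does not say which variant was used, so the sketch below assumes Cohen's kappa for two coders, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `cohen_kappa` and the theme labels are hypothetical illustrations, not the study's data or code.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same items.

    Assumption: two coders and nominal labels; the abstract does not
    state which kappa variant the authors used.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal label proportions.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is undefined,
        # so perfect agreement is reported by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical theme codes for ten focus-group excerpts (not study data).
coder_a = ["retrieval", "retrieval", "retrieval", "retrieval", "trust",
           "trust", "trust", "critical", "critical", "critical"]
coder_b = ["retrieval", "retrieval", "trust", "retrieval", "trust",
           "trust", "critical", "critical", "critical", "retrieval"]

kappa = cohen_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the coders agree on 7 of 10 excerpts (p_o = 0.70) while chance agreement is p_e = 0.34, which gives κ ≈ 0.545.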
