Towards Trustworthy and Effective AI for Academic Policy Navigation: Human Evaluation of a Source-Aware, Domain-Optimized RAG-Based Chatbot



Abstract

Navigating institutional policies remains a challenge for students and staff due to complex legalistic language, hierarchical structures, and dispersed documentation. While Large Language Models (LLMs) such as GPT-4o offer fluent natural language capabilities, their susceptibility to hallucination limits their perceived trustworthiness in academic contexts where factual accuracy and traceability are critical. This study investigates how a combination of transparency-enhancing tactics (specifically, source citation and human-centered evaluation) and domain-specific performance strategies can support the development of more trustworthy and effective AI systems. We present a source-aware, Retrieval-Augmented Generation (RAG)-based chatbot designed to assist users in interpreting Bournemouth University’s Code of Practice for Research Degrees. The system integrates trust-building interventions with performance-enhancing techniques tailored to policy documents, including layout-aware chunking, hybrid self-reranking, and semantic vector search using Pinecone. Quantitative evaluation using the RAGAS framework and BERTScore shows a high faithfulness score (0.9597), outperforming baseline LLM responses. In a pilot user study with doctoral students, participants reported strong satisfaction with answer clarity (mean score: 3.60/4.0), and source attribution was accurate in 92% of responses. While not a complete solution for trustworthy AI, this work demonstrates how targeted design interventions that combine transparency and domain optimization can enhance both trust and effectiveness in AI-assisted academic policy navigation.
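
As a concrete illustration of the pipeline the abstract describes, below is a minimal sketch of a source-aware retrieval-and-generation step in Python, using Pinecone for semantic vector search and an OpenAI model for generation. The index name, metadata fields, embedding model, and prompt are illustrative assumptions rather than the authors' actual configuration, and the paper's layout-aware chunking and hybrid self-reranking stages are omitted for brevity.

```python
# Minimal sketch of a source-aware RAG answer step: embed the query,
# retrieve policy chunks from a Pinecone index, and prompt the LLM to
# answer only from the retrieved text, citing each source section.
# Index name, metadata fields, and model choices are illustrative
# assumptions, not the authors' exact setup.
from openai import OpenAI
from pinecone import Pinecone

llm = OpenAI()                      # reads OPENAI_API_KEY from the environment
pc = Pinecone()                     # reads PINECONE_API_KEY from the environment
index = pc.Index("policy-chunks")   # hypothetical index of Code of Practice chunks

def answer_with_sources(question: str, top_k: int = 5) -> str:
    # Embed the question with the same model assumed for indexing the chunks.
    emb = llm.embeddings.create(model="text-embedding-3-small", input=question)
    query_vec = emb.data[0].embedding

    # Semantic vector search over the policy chunks; metadata is assumed to
    # carry the chunk text and a section label from layout-aware chunking.
    result = index.query(vector=query_vec, top_k=top_k, include_metadata=True)
    context = "\n\n".join(
        f"[{m.metadata['section']}] {m.metadata['text']}" for m in result.matches
    )

    # Constrain the model to the retrieved excerpts and require citations,
    # so every claim in the answer is traceable to a policy section.
    chat = llm.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Answer using ONLY the policy excerpts below. "
                        "Cite the bracketed section label after each claim. "
                        "If the excerpts do not cover the question, say so.\n\n"
                        + context},
            {"role": "user", "content": question},
        ],
    )
    return chat.choices[0].message.content
```

Constraining generation to the retrieved excerpts and requiring a section label after each claim is the design choice that makes answers traceable to the source document, which is the behavior the faithfulness and source-attribution results above evaluate.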
