Semantic Bridges for Student Modeling: Leveraging LLM-Generated Narratives for Interpretable Representation Learning
Abstract
Student modeling plays a central role in adaptive learning systems. However, current methods often require a trade-off between representational richness and interpretability. Traditional feature engineering produces interpretable but semantically shallow representations, while deep learning methods achieve rich embeddings at the cost of explainability. We propose a framework that utilizes Large Language Model (LLM)-generated narratives as semantic bridges between structured educational data and dense vector representations. Unlike representations learned from scratch, LLM-derived embeddings inherit semantic structure from pretraining on vast text corpora, encoding domain knowledge about education, academic performance, and student success factors. Our approach generates human-readable narratives from student data, then extracts embeddings that preserve both interpretability (through the narrative layer) and computational utility (through dense vectors). We present a comprehensive quality assessment framework evaluating narratives across five dimensions: content quality, linguistic diversity, coherence, uniqueness, and tone appropriateness. Empirical evaluation on 3,169 undergraduate students demonstrates strong content coverage and coherence, with excellent uniqueness.
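To make the pipeline sketched above concrete (structured student record → human-readable narrative → dense vector), here is a minimal illustrative sketch. The record schema, `generate_narrative` (a template standing in for the LLM call), and `embed` (a toy hashed bag-of-words encoder standing in for an LLM-derived sentence embedding) are all assumptions for illustration, not the authors' implementation:

```python
import hashlib
import math

def generate_narrative(record: dict) -> str:
    # Template stand-in for the LLM narrative-generation step (illustrative only).
    return (
        f"{record['name']} is a year-{record['year']} undergraduate with a "
        f"GPA of {record['gpa']:.2f}, attending {record['attendance']:.0%} "
        f"of classes. Recent performance suggests "
        f"{'steady progress' if record['gpa'] >= 3.0 else 'a need for support'}."
    )

def embed(text: str, dim: int = 64) -> list[float]:
    # Toy hashed bag-of-words embedding; a real system would use an
    # LLM or sentence-encoder embedding of the narrative instead.
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

record = {"name": "Student A", "year": 2, "gpa": 3.4, "attendance": 0.91}
narrative = generate_narrative(record)   # interpretable narrative layer
vector = embed(narrative)                # dense, computable representation
```

The two-stage design is the point: the narrative remains inspectable by instructors, while downstream models consume only the vector.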