Semantic Bridges for Student Modeling: Leveraging LLM-Generated Narratives for Interpretable Representation Learning


Abstract

Student modeling plays a central role in adaptive learning systems. However, current methods often require a trade-off between representational richness and interpretability. Traditional feature engineering produces interpretable but semantically shallow representations, while deep learning methods achieve rich embeddings at the cost of explainability. We propose a framework that utilizes Large Language Model (LLM)-generated narratives as semantic bridges between structured educational data and dense vector representations. Unlike representations learned from scratch, LLM-derived embeddings inherit semantic structure from pretraining on vast text corpora, encoding domain knowledge about education, academic performance, and student success factors. Our approach generates human-readable narratives from student data, then extracts embeddings that preserve both interpretability (through the narrative layer) and computational utility (through dense vectors). We present a comprehensive quality assessment framework evaluating narratives across five dimensions: content quality, linguistic diversity, coherence, uniqueness, and tone appropriateness. Empirical evaluation on 3,169 undergraduate students demonstrates strong content coverage and coherence, with excellent uniqueness.
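The narrative-to-embedding pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the record fields, the template narrative (standing in for the LLM call), and the hash-based embedding (standing in for a pretrained text-embedding model) are all placeholder assumptions chosen so the sketch runs self-contained.

```python
from dataclasses import dataclass
from typing import List
import hashlib
import math

@dataclass
class StudentRecord:
    student_id: str       # hypothetical fields; the paper's schema may differ
    gpa: float
    credits_completed: int
    major: str

def record_to_narrative(rec: StudentRecord) -> str:
    # Placeholder for the LLM step: in the framework, an LLM turns the
    # structured record into a fluent, human-readable narrative. A fixed
    # template stands in here so the sketch needs no external service.
    return (
        f"Student {rec.student_id} is majoring in {rec.major}, has completed "
        f"{rec.credits_completed} credits, and holds a GPA of {rec.gpa:.2f}."
    )

def embed_narrative(text: str, dim: int = 8) -> List[float]:
    # Placeholder for a pretrained text-embedding model: a deterministic
    # hash-based projection, L2-normalized, so the example is runnable.
    vec = []
    for i in range(dim):
        digest = hashlib.sha256(f"{i}:{text}".encode()).digest()
        vec.append(int.from_bytes(digest[:4], "big") / 2**32 - 0.5)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

rec = StudentRecord("S001", 3.42, 78, "Computer Science")
narrative = record_to_narrative(rec)    # interpretable layer (readable text)
embedding = embed_narrative(narrative)  # computational layer (dense vector)
```

The two return values mirror the framework's dual goals: the narrative remains inspectable by humans, while the fixed-dimension vector feeds downstream student models.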