Leveraging Knowledge Profiles and Generative AI for Realistic Student Response Generation

Abstract

The scarcity of high-quality labeled data often hinders the effective use of automated formative feedback in education. While analytic rubrics offer a reliable framework for automated grading, training robust models still requires hundreds of expert-labeled responses, an expensive and time-consuming process. This paper proposes a methodology for generating diverse, rubric-aligned synthetic student responses using large language models (LLMs) guided by knowledge profiles and representative examples. We introduce two profile-based generation strategies, straightforward and error-informed, and evaluate them against a dataset (N = 585) of authentic open-ended logical-proof responses from a Discrete Mathematics course. We analyze the diversity and realism of the generated datasets using embedding-based distance metrics and principal component analysis (PCA), and assess their utility for training automated grading models. Our results show that synthetic responses are less diverse than authentic ones, and that models trained solely on generated data perform worse than those trained on real data. However, combining small authentic datasets with generated data significantly improves model performance, suggesting that synthetic data is an effective augmentation strategy in low-resource educational settings.
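The abstract mentions comparing the diversity of synthetic and authentic responses with embedding-based distance metrics and PCA. The sketch below is a rough illustration of that kind of analysis, not the authors' implementation: it assumes a sentence-transformer encoder (the model name is a placeholder), uses mean pairwise cosine distance as the diversity proxy, and projects both sets into a shared 2-D PCA space; the example responses are hypothetical.

```python
# Illustrative sketch (not the paper's code) of an embedding-based
# diversity comparison between authentic and synthetic responses.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import pairwise_distances
from sentence_transformers import SentenceTransformer  # assumed embedding backend

def mean_pairwise_distance(embeddings: np.ndarray) -> float:
    """Average cosine distance over all pairs; a rough diversity proxy."""
    d = pairwise_distances(embeddings, metric="cosine")
    n = len(embeddings)
    return d[np.triu_indices(n, k=1)].mean()

model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical encoder choice

authentic = ["Assume n is even, so n = 2k for some integer k ...", "..."]  # real student proofs
synthetic = ["Let n be even; then n = 2k ...", "..."]                      # LLM-generated proofs

emb_auth = model.encode(authentic)
emb_syn = model.encode(synthetic)

print("authentic diversity:", mean_pairwise_distance(emb_auth))
print("synthetic diversity:", mean_pairwise_distance(emb_syn))

# Fit PCA on both sets together so their spreads can be compared in one space.
pca = PCA(n_components=2).fit(np.vstack([emb_auth, emb_syn]))
auth_2d, syn_2d = pca.transform(emb_auth), pca.transform(emb_syn)
```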
