Theory-Prompt-Validation: A Practice-Oriented Approach to Using LLMs for Verbal Coding in the Learning Sciences

Luisa Wellert
Alexander Braun
Sara Becker
Pauline Frick
Wieland Brendel
Andreas Lachner

Abstract

Large Language Models (LLMs) hold growing potential for scaling the analysis of qualitative data in the learning sciences. One essential step in qualitative analysis is coding verbal data. Although several approaches have been proposed for automating coding with LLMs, few appear well-suited to the needs of the learning sciences, where coding requires identifying descriptive content categories and pedagogical functions of utterances, often implicit and difficult to detect. To address this challenge, we provide the Theory–Prompt–Validation (TPV) approach, a three-step process comprising theory modeling, prompt engineering, and validation. This approach provides a blueprint for evidence-based application of LLMs for context-sensitive coding of verbal data. The TPV approach builds on existing methods while incorporating requirements specific to the learning sciences. We emphasize a theory-driven, evidence-based process, including validation analyses to verify that pragmatic functions are accurately captured in coding. To illustrate the TPV approach, we applied it to AI-student tutoring dialogues. We developed a theory-driven codebook and implemented it in a Python-based script leveraging GPT-4o to segment utterances and assign codes automatically. Intercoder agreement between the LLM and a human coder was substantial (κ = .73, 95% CI [0.69, 0.76]) and descriptively higher than between two human coders (κ = .69, 95% CI [0.66, 0.73]). Validity analyses revealed theoretically meaningful patterns, such as positive effects when the code feedback was assigned more frequently. Our work demonstrates that LLM-based coding can reliably and validly scale the analysis of verbal data, bridging the gap between time-intensive qualitative methods and large-scale, evidence-based educational research.
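The agreement statistic reported above, Cohen's κ with a 95% confidence interval, can be illustrated with a minimal Python sketch. This is not the authors' script. The function names `cohen_kappa` and `bootstrap_ci` are hypothetical, and the percentile bootstrap is an assumption, since the abstract does not say how the interval was computed.

```python
import random
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(a)
    # Observed agreement: share of segments that received the same code.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each coder's marginal code frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        # Both coders used one identical code throughout; kappa is undefined
        # here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def bootstrap_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for kappa (an assumed method)."""
    rng = random.Random(seed)
    n = len(a)
    stats = []
    for _ in range(n_boot):
        # Resample coded segments with replacement, keeping pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(cohen_kappa([a[i] for i in idx], [b[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up codes for illustration only; these are not data from the study.
human = ["feedback", "question", "feedback", "explanation", "question", "feedback"]
llm = ["feedback", "question", "explanation", "explanation", "question", "feedback"]
kappa = cohen_kappa(human, llm)
ci_low, ci_high = bootstrap_ci(human, llm)
```

With real data, `human` and `llm` would hold one code per segmented utterance in the same order, so that each index refers to the same segment for both coders.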

Version published to 10.31234/osf.io/zyje8_v1 on OSF Preprints
Dec 11, 2025

Related articles

When and How Does LLM-Generated Feedback Surpass Traditional Automated Writing Evaluation? A Learning Trajectory Analysis of Writing Improvement
Da-Wei Zhang, Xinyu Hong, Yuying Qi
Latest version Dec 18, 2025

Argumentative essay assessment with LLMs: A critical scoping review
Lucile Favero, Gabrielle Gaudeau, Juan Antonio Pérez-Ortiz, Tanja Käser, Nuria Oliver
Latest version Feb 2, 2026

Bridging Psychometric and Content Development Practices with AI: A Community-Based Workflow for Augmenting Hawaiian Language Assessments
Frank Brockmann, Pōhai Kūkea Shultz
Latest version Dec 15, 2025
