TheoTeach: An LLM-Powered Socratic Tutoring System for Teaching Theoretical Computer Science Topics
Abstract
Automata theory and formal language theory occupy a foundational but notoriously difficult position in undergraduate computer science curricula. Students frequently struggle with the abstract formalism of finite automata, including the five-tuple definition, transition function semantics, and state diagram construction. While large language model (LLM)-based tutoring systems offer a promising new modality for individualized instruction, two well-documented failure modes limit their pedagogical value: the tendency to reveal complete solutions rather than scaffolding student reasoning, and the generation of pedagogically inappropriate declarative statements in place of guiding questions (Ding et al., 2024). This paper presents TheoTeach, an LLM-powered Socratic tutoring system designed specifically for teaching DFA concepts to students with no prior exposure to automata theory. TheoTeach implements a four-layer validation pipeline that enforces Socratic dialogue constraints and prevents solution leakage at inference time, a twenty-two-step vocabulary-gated curriculum derived from professor-authored instructional materials, per-student concept mastery tracking, and integration with the JFLAP finite automaton modeling tool (Rodger & Finley, 2006). The system runs NousResearch/Meta-Llama-3.1-8B-Instruct with a TheoTeach LoRA adapter, served via vLLM on a GPU-equipped cloud server; an independent validator instance using the same base model, without the LoRA adapter, evaluates and, when necessary, rephrases each generated response before delivery to the student.
We describe the system architecture in sufficient detail to support replication and propose a controlled between-subjects study in which undergraduate students at East Carolina University will be randomly assigned to either a TheoTeach tutoring condition or a traditional instructional materials control. Participants will complete a DFA knowledge assessment immediately after a structured instructional session and again at a two-week retention interval. This paper reports the system design, instructional rationale, and planned study protocol; empirical outcomes will be reported upon study completion.
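
For readers unfamiliar with the formalism the abstract refers to, a DFA is conventionally given as a five-tuple (Q, Σ, δ, q0, F): a set of states, an input alphabet, a transition function, a start state, and a set of accepting states. The following Python sketch is purely illustrative and is not taken from TheoTeach; the class name `DFA`, its field names, and the example automaton `EVEN_ZEROS` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """Illustrative five-tuple (Q, Sigma, delta, q0, F); names are hypothetical."""
    states: frozenset      # Q: finite set of states
    alphabet: frozenset    # Sigma: input symbols
    delta: dict            # transition function: (state, symbol) -> state
    start: str             # q0: start state, a member of Q
    accepting: frozenset   # F: accepting states, a subset of Q

    def accepts(self, word: str) -> bool:
        """Run the automaton on `word` and report whether it ends in F."""
        state = self.start
        for symbol in word:
            if symbol not in self.alphabet:
                raise ValueError(f"symbol {symbol!r} is not in the alphabet")
            # A DFA has exactly one next state per (state, symbol) pair.
            state = self.delta[(state, symbol)]
        return state in self.accepting


# Example automaton (an assumption for illustration): binary strings
# containing an even number of 0s.
EVEN_ZEROS = DFA(
    states=frozenset({"q_even", "q_odd"}),
    alphabet=frozenset({"0", "1"}),
    delta={
        ("q_even", "0"): "q_odd",
        ("q_even", "1"): "q_even",
        ("q_odd", "0"): "q_even",
        ("q_odd", "1"): "q_odd",
    },
    start="q_even",
    accepting=frozenset({"q_even"}),
)
```

Calling `EVEN_ZEROS.accepts("1001")` walks the transition table one symbol at a time, which mirrors how a state diagram is traced by hand.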