A Benchmark for Math Misconceptions: Bridging Gaps in Middle School Algebra with AI-Supported Instruction
Abstract
This study introduces an evaluation benchmark for middle school algebra to be used in artificial intelligence (AI) based educational platforms. The goal is to support the design of AI systems that can enhance learners' conceptual understanding of algebra by taking into account learners' current level of algebra comprehension. The dataset comprises 55 algebra misconceptions and common errors, along with 220 diagnostic examples identified in prior peer-reviewed studies. We provide an example application using GPT-4, observing a range of precision and recall scores depending on the topic and experimental setup, reaching 83.9% when including educators' feedback and restricting testing by topic. We found that topics such as ratios and proportions prove as difficult for GPT-4 as they are for students. We include a human assessment of GPT-4's results, as well as feedback from five middle school math educators on the clarity and occurrence of the misconceptions in the dataset and on the potential use of AI in conjunction with the dataset. Most educators (80% or more) indicated that they encounter these misconceptions among their students, suggesting the dataset's relevance to teaching middle school algebra. Despite varied familiarity with AI tools, four out of five educators expressed interest in using the dataset with AI to diagnose students' misconceptions or to train teachers. The results emphasize the importance of topic-constrained testing, the need for multimodal approaches, and the value of human expertise in gaining practical insights when using AI for human learning.
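To make the evaluation setup concrete, the sketch below shows one way per-topic precision and recall could be computed when a model's predicted misconception labels are scored against gold labels, reflecting the topic-constrained testing the abstract highlights. All names, misconception IDs, and data here are illustrative assumptions, not taken from the paper's dataset or code.

```python
# Hypothetical sketch: score misconception diagnoses per topic.
# Topics, misconception IDs (e.g. "M12"), and examples are made up
# for illustration; they do not come from the benchmark itself.
from collections import defaultdict

def precision_recall_by_topic(examples):
    """examples: list of dicts with 'topic', 'gold', and 'predicted'
    misconception IDs. Returns {topic: {'precision': p, 'recall': r}}."""
    tp = defaultdict(int)
    fp = defaultdict(int)
    fn = defaultdict(int)
    for ex in examples:
        topic = ex["topic"]
        if ex["predicted"] == ex["gold"]:
            tp[topic] += 1
        else:
            fp[topic] += 1  # wrong misconception predicted
            fn[topic] += 1  # correct misconception missed
    scores = {}
    for topic in tp.keys() | fp.keys() | fn.keys():
        p = tp[topic] / (tp[topic] + fp[topic]) if (tp[topic] + fp[topic]) else 0.0
        r = tp[topic] / (tp[topic] + fn[topic]) if (tp[topic] + fn[topic]) else 0.0
        scores[topic] = {"precision": p, "recall": r}
    return scores

examples = [
    {"topic": "ratios", "gold": "M12", "predicted": "M12"},
    {"topic": "ratios", "gold": "M14", "predicted": "M03"},
    {"topic": "equations", "gold": "M07", "predicted": "M07"},
]
print(precision_recall_by_topic(examples))
```

Reporting scores per topic, rather than one aggregate number, is what surfaces the kind of finding the abstract describes, e.g. that ratios and proportions are harder than other topics.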