What Does It Mean to Explain? A Functional Taxonomy Across Human and Machine Learning

Abstract

Forms of explanation are typically studied as cognitive events: researchers ask what explaining does to the explainer or the learner. This paper instead analyzes explanation as a family of instructional operations and asks what functionally distinct behaviors are grouped under the everyday verb "explain." Drawing on behavior-analytic concepts, the paper proposes a two-dimensional taxonomy organized by (a) the learner's existing repertoire and (b) the type of concept being taught. Explanation types include structured examples, demonstrations, rule statements, metaphors and analogies, causal chains, and four subtypes of corrective feedback (simple, feature-directed, rule-referencing, and error-magnitude signaling). Each is defined by its repertoire requirements and the concept types it can and cannot handle. The paper then maps these behavioral categories onto parallel constructs in cognitivism and machine learning, showing that many apparent theoretical disagreements reduce to terminological translation problems, while also identifying genuine gaps, most notably the absence of widely adopted, training-time counterparts to feature-directed and rule-referencing feedback in current neural network practice. These gaps are independently corroborated by a recent comprehensive survey documenting systematic breakdowns in LLM reasoning that align with the failure modes the present taxonomy predicts. The taxonomy generates testable predictions about when particular forms of explanation should be necessary, sufficient, or impossible, and it suggests new directions for research on explanatory operations in both human and machine learners.
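To make the claimed gap concrete, consider what a training-time counterpart to feature-directed feedback might look like. The sketch below is not from the paper; it is a minimal, hypothetical illustration in the spirit of input-gradient regularization (Ross et al., 2017, "Right for the Right Reasons"), where a teacher signal marks certain input features as irrelevant and the loss penalizes gradient mass on them. The toy dataset, the `irrelevant` mask, and all names are illustrative assumptions.

```python
# Hypothetical sketch: "feature-directed feedback" as an auxiliary training loss.
# Standard cross-entropy tells the model only that it is wrong ("simple" feedback);
# the added term also says *which features* it should not be relying on.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: dims 0-1 are genuinely informative; dim 2 is spurious but
# predictive in this sample; dim 3 is noise.
n, d = 256, 4
x = torch.randn(n, d)
y = (x[:, 0] + x[:, 1] > 0).long()
x[:, 2] = y.float() + 0.1 * torch.randn(n)          # spurious correlate
irrelevant = torch.tensor([0.0, 0.0, 1.0, 1.0])      # teacher flags dims 2-3

model = nn.Sequential(nn.Linear(d, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
ce = nn.CrossEntropyLoss()

for step in range(200):
    x_in = x.clone().requires_grad_(True)
    logits = model(x_in)
    task_loss = ce(logits, y)                        # simple right/wrong signal

    # Feature-directed component: gradient of the loss w.r.t. the inputs,
    # masked to the features the teacher marked irrelevant. create_graph=True
    # lets this penalty itself be differentiated during backward().
    (input_grad,) = torch.autograd.grad(task_loss, x_in, create_graph=True)
    feedback_loss = (input_grad * irrelevant).pow(2).sum(dim=1).mean()

    loss = task_loss + 10.0 * feedback_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the taxonomy's terms, the cross-entropy term alone corresponds to simple corrective feedback, while the masked gradient penalty plays the role of feature-directed feedback; the point of the sketch is that such signals exist as research techniques but, as the abstract notes, have no widely adopted training-time analogue in standard practice.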
