Conditions Underlying Success and Failure of Compositional Generalization in Distributional Models of Language
Abstract
Are distributional learning mechanisms capable of complex linguistic inferences requiring compositional generalization? This question has become contentious with the development of large language models, which mimic human language abilities in many ways but struggle with compositional generalization. We investigated a set of qualitatively different distributional models (word co-occurrence models, graphical models, recurrent neural networks, and transformers) by training them on a carefully controlled artificial language containing combinatorial dependencies involving multiple words, and then testing them on novel sequences containing distributionally overlapping combinatorial dependencies. We show that graphical network models and transformer models, but not co-occurrence space models or recurrent neural networks, were able to perform compositional generalization. The work demonstrates that some distributional models can perform compositional generalization, and that their success or failure turns on their ability to represent words both individually and as parts of the phrases in which they participate.