Conditions Underlying Success and Failure of Compositional Generalization in Distributional Models of Language
Abstract
Are distributional learning mechanisms capable of complex linguistic inferences requiring compositional generalization? This question has become contentious with the development of large language models, which mimic human language abilities in many ways but struggle with compositional generalization. We investigated a set of qualitatively different distributional models (word co-occurrence models, graphical models, recurrent neural networks, and transformers) by training them on a carefully controlled artificial language containing combinatorial dependencies involving multiple words, and then testing them on novel sequences containing distributionally overlapping combinatorial dependencies. We show that graphical network models and transformer models, but not co-occurrence space models or recurrent neural networks, were able to perform compositional generalization. The work demonstrates that some distributional models can perform compositional generalization, and that their success or failure turns on their ability to represent words both individually and as parts of the phrases in which they participate.