An Investigation of Comparative Correlative Constructions in Auto-regressive Large Language Models: From Construction Grammar to Computational Understanding
Abstract
This paper investigates how autoregressive large language models (LLMs) process and understand Comparative Correlative constructions (CCs). Through a series of systematic experiments examining both the syntactic and semantic behaviors of LLMs, we evaluate the performance of leading models, including ChatGPT-4o, ChatGPT-3.5, Claude-3.5 Sonnet, and Claude-3.0 Opus. Based on a dataset of n = 5,300 data points, the results reveal significant limitations in these models’ handling of complex syntactic structures and logical reasoning. While the models demonstrate high accuracy in recognizing the syntactic patterns of CCs, they exhibit pronounced sensitivity to surface-level features, particularly name-order variations, in semantic interpretation tasks. More critically, our experiments show that the models struggle with counterintuitive premises that conflict with common-sense knowledge. When explicit CC premises are omitted entirely, performance deteriorates further, with newer models showing the most dramatic declines. These findings indicate that, despite high accuracy on surface-level pattern recognition, the models rely more on statistical patterns from pre-training than on genuine logical inference, even when provided with explicit premises. This paper extends previous research on LLMs’ processing of CC structures, revealing a fundamental disparity between surface-level pattern recognition and genuine logical reasoning, findings that deepen our understanding of the cognitive architecture underlying LLMs.