Functional factors predict referential choice similarly across languages: A cross-linguistic computational analysis

Abstract

Speakers' choices of how to express referents in connected discourse are driven by a wide variety of considerations. Yet it remains unclear precisely how the relevant factors shape referential choice, and which of them matter most during discourse production across languages. Using computational modeling on a naturalistic corpus of spoken narratives from 16 diverse languages, we systematically investigated how well 16 functional factors predict referential choice. We also used a novel technique from explainable artificial intelligence to gain precise insight into the decision-making processes of our models and thereby draw inferences about the relationship between these factors and referential choice. We show that 1) referential choice is multifactorial; 2) anaphoric distance, referent salience in local discourse, and the syntactic function to which a referent is allocated are most predictive of referential form; 3) many other factors discussed prominently in the literature, such as referent competition, topic persistence, or the presence of agreement, have little to no predictive power; and 4) these factors are generally shared across languages, with large cross-linguistic differences mostly pertaining to idiosyncratic properties of individual languages, such as word order or the expression of nominal number. Our study is the first large-scale study of referential choice across diverse languages using naturalistic corpus data.
