Are Natural Language Data “Nature-Identical” and What Is Elicitation After All?
Abstract
Language documentation discourse commonly divides language data into two large types: natural(istic) vs. elicited. The goal of this paper is to put this dichotomy under critical scrutiny. By examining key publications on linguistic fieldwork, I show that the two terms seldom receive any clear definition and are often used inconsistently, giving rise to evident contradictions. The analysis reveals that the terms are typically distinguished by two parameters – linguistic unit (texts vs. not texts) and context of language production (controlled vs. uncontrolled) – but the distinction is virtually never maintained consistently. I argue that the dichotomy natural(istic) vs. elicited is insufficient to capture the complexity of possible scenarios and forms under which language is produced. Building upon previous literature, I propose a more detailed classification of language data, which abandons the notions of ‘natural(istic)’ and ‘elicited’ altogether. The paper concludes by discussing the benefits of reflecting more carefully on language data types.