Authorship Identity and Spatiality: Social Influences on Text Production
Abstract
Computational text analysis has grown in popularity among social scientists due to the massive influx of digitized data. Connecting text to authorship, however, could be a boon for digital demography and expand the scope of computational text analysis from trends in what is being written toward the social patterning of the people producing it. We explore this potential by examining a large corpus of college admissions essays (n = 254,820 essays submitted by 83,538 applicants) and show how personal identity markers and ZIP code-level social context data influence large-scale processes of textual production. After generating numerical representations of the essays using computational methods, we model the relationships between the essays and the identity and spatial characteristics of applicants and their local communities. We find strong relationships between the essays and both identity and spatial features. We also find that individuals whose personal identities are spatially unique (that is, demographically different from others in their immediate context) were most likely to be misclassified, indicating that writing is influenced both socially and spatially. This work clarifies how authorship characteristics shape large-scale textual production processes, like college admissions, and complements other large-scale analyses of text by focusing on authorship rather than purely textual patterns.