The Death of the Author, Reconsidered: Spatial and Demographic Constraints on College Admissions Essay Writing

AJ Alvero
anthony lising antonio
Leslie Luqueño
Francis Pearman


Abstract

Computational text analysis has grown in popularity among social scientists due to the massive influx of digitized data available to study. However, much of this research disconnects patterns observed in text from information about the original authors. Eliding authorship considerations from sociological analysis of text can lead to claims and assertions about trends that appear independent of the social actors, conditions, interactions, and contexts in which the text was produced. While text analysis without authorship information can yield reasonable inferences about society, complementing that approach with research that explicitly considers the people producing the text could expand the theoretical and empirical scope of work in this area. In this paper, we adapt perspectives from sociolinguistics and explicitly consider categorical identity markers of authors and geography as foundational axes of variation in textual data. We explore these dimensions in a large corpus of college admissions essays (n = 254,820 essays submitted by 83,538 applicants), along with metadata about applicant identity, including the ZIP code of each applicant's high school. After generating features of the essays using computational methods, we find that author identity markers, such as gender, parental education, and socioeconomic status, are highly salient. ZIP code level socioeconomic measures are also highly correlated with the writing style and content of local applicants. Finally, individuals whose personal identities are spatially unique (that is, demographically different from others in their immediate context) were most likely to be misclassified by our models, indicating that writing is influenced both socially and spatially. This work clarifies how authorship characteristics, like identity and spatial context, constrain the breadth of what we write and how we write by showing strong alignment between text and authors that is observable through machine reading of text.

Version published to 10.31235/osf.io/pt6b2_v3 on OSF Preprints
Aug 23, 2025
Version published to 10.31235/osf.io/pt6b2_v2 on OSF Preprints
Jan 31, 2025
Version published to 10.31235/osf.io/pt6b2_v1 on OSF Preprints
Sep 16, 2022

Related articles

Multi-Scale Computational Analysis of Wikipedia’s Telling of Global History

This article has 7 authors:
1. Steph Buongiorno
2. Jo Guldi
3. Marnie Hughes-Warrington
4. Nan Jiang
5. Rosie Larson
6. Sohan Bellam
7. Gregory J. Palermo
This article has no evaluations. Latest version: Jan 19, 2026

An Overview of Public Administration Research: Insights From Publications in Leading Journals (2000–2024)

This article has 1 author:
1. Vener Garayev
This article has no evaluations. Latest version: Dec 12, 2025

The Table of Media Bias Elements: A sentence-level taxonomy of media bias types and propaganda techniques

This article has 2 authors:
1. Tim Menzner
2. Jochen L. Leidner
This article has no evaluations. Latest version: Jan 13, 2026
