Beyond Proximity: Constructing Organic Neighborhoods Using a Two-Stage Unsupervised Learning Approach
Abstract
Studying the relationship between neighborhoods and individual-level outcomes such as crime, labor market success, or intergenerational mobility has a long history in the social sciences. Because local processes such as gentrification or residential mobility constantly change neighborhoods’ composition and spatial expansion, time-constant, one-size-fits-all neighborhood measures fail to capture important local dynamics. This paper presents a flexible and data-driven approach for efficiently estimating overlapping and arbitrarily shaped neighborhoods with time-dynamic boundaries. The approach uses a two-stage clustering design: the first stage identifies homogeneous groups within a city (using an automated K-Means algorithm), while the second stage clusters these homogeneous groups by spatial proximity (using the HDBSCAN algorithm). In an analysis of 86 million person-year observations from 76 German cities, the paper shows that a larger spatial expansion of neighborhoods with a high socioeconomic status correlates negatively with the number of crime cases in a city, while higher neighborhood fragmentation and heterogeneity correlate positively with crime rates. The findings stress the importance of flexible neighborhood estimation techniques and the need to view neighborhoods as non-constant entities. By modeling contexts as agentic players in this way, the two-stage algorithm offers a novel and transparent tool for considering the spatial embeddedness of individuals, firms, or regions in sociological research.
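To make the two-stage design concrete, the following is a minimal, standard-library-only sketch and not the paper's implementation. It rests on several assumptions: the records, the single "income" attribute, and the parameters k, eps, and min_size are all hypothetical; the number of groups k is fixed by hand rather than selected automatically; and the second stage uses simple distance-threshold connected components as a crude stand-in for HDBSCAN. Because each homogeneous group is clustered spatially on its own, neighborhoods belonging to different groups may overlap in space.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with a deterministic, evenly spaced initialization."""
    centers = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def proximity_clusters(coords, eps, min_size):
    """Stand-in for HDBSCAN (assumption): connected components of the graph
    linking points at most eps apart; small components become noise (-1)."""
    n = len(coords)
    labels = [-1] * n
    visited = [False] * n
    next_label = 0
    for start in range(n):
        if visited[start]:
            continue
        visited[start] = True
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if not visited[j] and math.dist(coords[i], coords[j]) <= eps:
                    visited[j] = True
                    stack.append(j)
        if len(component) >= min_size:
            for i in component:
                labels[i] = next_label
            next_label += 1
    return labels

def two_stage(records, k, eps, min_size):
    """Stage 1 groups records by attribute; stage 2 splits each group spatially.
    Returns one (group, spatial_cluster) pair per record."""
    groups = kmeans([(r[2],) for r in records], k)
    result = [None] * len(records)
    for g in sorted(set(groups)):
        idx = [i for i, lab in enumerate(groups) if lab == g]
        spatial = proximity_clusters([records[i][:2] for i in idx], eps, min_size)
        for i, s in zip(idx, spatial):
            result[i] = (g, s)
    return result

# Hypothetical toy data: (x, y, income).
records = [
    (0.0, 0.0, 90), (0.1, 0.0, 95), (0.0, 0.1, 92),     # high-status pocket, west
    (5.0, 5.0, 91), (5.1, 5.0, 94), (5.0, 5.1, 93),     # high-status pocket, east
    (0.05, 0.05, 20), (0.1, 0.1, 22), (0.0, 0.15, 21),  # low-status pocket overlapping the west
    (9.0, 0.0, 19),                                     # isolated low-status record
]
neighborhoods = two_stage(records, k=2, eps=0.5, min_size=2)
```

In this toy example the two high-status pockets share a stage-one group but form separate neighborhoods, the low-status pocket forms its own neighborhood on top of the western high-status one, and the isolated record is treated as noise. Re-running the procedure on each year's data would be one way to obtain time-varying boundaries.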