Geocoding historical census data for Stockholm, 1878-1950
Abstract
This paper describes the methodology for geocoding historical Swedish census data from 1878-1950, developed as part of the research project "Cities and Socioeconomic Segregation in the Long Term, 1880-2017." While Sweden's historical demographic data is renowned for its quality and coverage, geocoding this data presents unique challenges that cannot be solved using modern geocoding APIs. The primary obstacles include temporal changes in address-coordinate relationships, street relocations, and the complete disappearance of spatial units through demolition and redevelopment. Historical Swedish census data varies in geographic precision, ranging from village-level information in rural areas to property-level detail in urban centers like Stockholm. The paper proposes a solution based on constructing a "canonical" historical address and property database that incorporates temporal dimensions, allowing for accurate matching at specific time points. This database is compiled from multiple sources and validated against georeferenced historical city maps. The methodology addresses the distinction between property-level (block name and number) and address-level (street name and house number) geocoding, with property coordinates proving more temporally stable. Manual data collection and quality assurance are essential components of the process, particularly for areas subject to major urban redevelopment such as Stockholm's Klara neighborhood. This approach enables accurate geocoding of historical census data while maintaining spatial precision appropriate for demographic analysis of urban segregation patterns over more than a century.
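The idea of matching a census record against a time-aware address database can be illustrated with a minimal sketch. It is not the project's implementation: the record layout, the field names (`valid_from`, `valid_to`), the function `geocode`, the street name "Exempelgatan", and the coordinates are all hypothetical placeholders chosen for illustration. The sketch assumes each database entry carries an inclusive range of years during which the address-coordinate pairing holds.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class AddressRecord:
    """One entry in a hypothetical time-aware address table."""
    street: str
    house_number: str
    valid_from: int  # first year the pairing holds (inclusive)
    valid_to: int    # last year the pairing holds (inclusive)
    x: float         # placeholder coordinates, not real survey data
    y: float


def _normalize(text: str) -> str:
    # Case-insensitive comparison with surrounding whitespace removed.
    return text.strip().casefold()


def geocode(
    records: Iterable[AddressRecord],
    street: str,
    house_number: str,
    year: int,
) -> Optional[Tuple[float, float]]:
    """Return (x, y) for the entry valid in `year`, or None if no entry matches."""
    key = (_normalize(street), _normalize(house_number))
    for record in records:
        same_address = (_normalize(record.street), _normalize(record.house_number)) == key
        if same_address and record.valid_from <= year <= record.valid_to:
            return (record.x, record.y)
    return None


# Invented example: the same address refers to two locations over time,
# for instance after a street was relocated.
EXAMPLE_RECORDS = [
    AddressRecord("Exempelgatan", "12", 1878, 1909, 100.0, 200.0),
    AddressRecord("Exempelgatan", "12", 1910, 1950, 150.0, 260.0),
]

early = geocode(EXAMPLE_RECORDS, "exempelgatan", "12", 1900)
late = geocode(EXAMPLE_RECORDS, "Exempelgatan", "12", 1930)
outside = geocode(EXAMPLE_RECORDS, "Exempelgatan", "12", 1960)
```

In this sketch the census year selects which of the two entries applies, so the same street name and house number resolve to different coordinates in 1900 and 1930, and a year outside every validity range yields no match rather than a wrong location. A property-level variant would key on block name and number instead of street name and house number.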