Measuring land-use mix with address-level census data
Abstract
This paper introduces a data-driven framework to evaluate mixed land use in Brazilian cities using the National Address File for Statistical Purposes (CNEFE), an address-level dataset from the 2022 census that records over 110 million geocoded establishments. We treat CNEFE records as point-based observations of functional use and aggregate them into H3 hexagonal grids to compute local residential and non-residential shares. Building on this representation, we calculate two standard indices – the Entropy Index (EI) and the Herfindahl–Hirschman Index (HHI) – and propose two directional extensions: an adapted HHI (aHHI), which maps functional dominance to a [−1, 1] scale, and the Bidirectional Global-centered Balance Index (BGBI), which measures deviations from the citywide residential proportion. The method is implemented in an open R workflow that automates data retrieval, preprocessing, and index computation for any municipality. Applying it to six major metropolitan areas at two H3 resolutions, we show that EI and HHI behave as expected but are blind to the direction of homogeneity, whereas aHHI and BGBI clearly distinguish predominantly residential from predominantly non-residential cells and highlight areas that match the global functional balance. Cross-scale comparisons document non-trivial sensitivity to grid resolution, reinforcing the need to report and test scale in mixed land-use studies.
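The claim that EI and HHI are blind to the direction of homogeneity can be checked with a small sketch. It is written in Python for illustration and is not the authors' R workflow. The two-category forms of EI and HHI below are the standard textbook definitions. The function `signed_dominance` is a hypothetical signed rescaling, added only to show what a [−1, 1] directional measure could look like. The abstract does not give the aHHI or BGBI formulas, so this function should not be read as either of them, and the sign convention (positive for residential) is an assumption.

```python
import math


def entropy_index(p_res: float) -> float:
    """Two-category Shannon entropy, normalized to [0, 1] by ln(2)."""
    shares = (p_res, 1.0 - p_res)
    h = -sum(s * math.log(s) for s in shares if s > 0)
    return h / math.log(2)


def hhi(p_res: float) -> float:
    """Herfindahl-Hirschman Index for two categories; ranges over [0.5, 1]."""
    return p_res ** 2 + (1.0 - p_res) ** 2


def signed_dominance(p_res: float) -> float:
    """HYPOTHETICAL signed rescaling of HHI to [-1, 1].

    Assumption: positive means residential-dominated, negative means
    non-residential-dominated. This is NOT the paper's aHHI or BGBI.
    """
    magnitude = 2.0 * hhi(p_res) - 1.0  # rescale [0.5, 1] to [0, 1]
    return math.copysign(magnitude, p_res - 0.5)


# Two cells with mirrored compositions: 90% vs 10% residential.
for p in (0.9, 0.1):
    print(p, entropy_index(p), hhi(p), signed_dominance(p))
```

The two mirrored cells receive the same EI and the same HHI (up to floating-point rounding), while the signed measure separates them by sign. This is the kind of distinction the abstract attributes to the directional indices.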