Improving Network Level Pedestrian Activity Prediction by Accounting for Spatial and Longitudinal Clustering in Crowdsourced Data
Abstract
Crowdsourced data are increasingly used to estimate active transportation volumes across large networks. Although bicycling data are widely available from third-party aggregators, comparable pedestrian data have been limited. Until recently, most pedestrian datasets captured only footfall at points of interest. Strava Metro began releasing link-level pedestrian activity data last year, and evaluations of these data remain scarce. This study is among the first to use Strava’s pedestrian activity data and to assess their validity against at-location counter measurements. We then develop a network-level pedestrian volume model that integrates Strava data with land use, weather, infrastructure, and facilities variables. We also address a methodological gap in prevailing models built on crowdsourced data, which often ignore the longitudinal and spatial clustering in such data. Using a mixed-effects predictive framework, we capture this structure and improve predictive performance. Our findings introduce a validated pedestrian data source and a robust model for network-level pedestrian volume estimation.
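
As an illustration of what a mixed-effects framework for longitudinal and spatially clustered counts can look like, the following is a minimal sketch of a random-intercept specification. The notation, the choice of random intercepts, and the Gaussian error assumption are illustrative assumptions and are not taken from the preprint, which does not state its exact specification in the abstract.

```latex
% Hypothetical notation, not the authors' specification.
% y_{ijt}: pedestrian volume (possibly transformed) on link i,
%          located in spatial cluster j, observed at time t.
% x_{ijt}: fixed-effect predictors (e.g. Strava activity, land use, weather).
\begin{align}
y_{ijt} &= \mathbf{x}_{ijt}^{\top}\boldsymbol{\beta} + u_j + v_i + \varepsilon_{ijt}, \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad
v_i \sim \mathcal{N}(0, \sigma_v^2), \qquad
\varepsilon_{ijt} \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{align}
```

In this sketch, the cluster-level intercept u_j lets links in the same spatial cluster share unexplained variation, while the link-level intercept v_i lets repeated observations of the same link over time be correlated. A model that omits both terms treats every observation as independent, which is the gap the abstract describes.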