A Functional Land Use Regression Model for NO2 Concentration in the Italian Alpine Region
Abstract
Air pollution is a key risk factor associated with adverse health outcomes. Although pollution levels are typically measured only at the limited locations where air monitoring stations are placed, knowledge of the spatial distribution of pollution is needed to assess the exposure of individuals or groups of individuals. Furthermore, since pollution levels are influenced by local sources such as traffic, industrial activities, and land use, it is of interest to enrich the prediction of pollution levels using information on these explanatory variables. Land Use Regression (LUR) models have emerged as effective tools for this purpose, relating pollutant concentrations to geographical characteristics. However, traditional LUR models often focus on average concentrations, overlooking crucial intra-day variability that can impact health outcomes. To address this, we introduce a novel functional LUR (FLUR) model designed to estimate hourly NO2 concentrations. This approach treats hourly pollutant measurements as functional data, capturing the continuous temporal dynamics of daily pollution levels. We applied this model to a comprehensive dataset of hourly NO2 measurements from 41 air monitoring stations in the Italian Alpine foothills and mountains, collected throughout the year 2023. Our functional penalized regression model takes the daily profile of hourly log-transformed NO2 concentrations as the dependent functional variable and a combination of scalar and functional variables as predictors. These variables may be related to meteorological information, key spatial predictors derived from GIS, or structural factors such as day of the week and day of the year. The predictors to include in the model were chosen through an innovative forward selection approach adapted to functional data. The approach is based on maximising the adjusted explained variance while avoiding concurvity among the selected predictors. This is achieved by taking into consideration the direction of the daily functional effect. Five key predictors influencing hourly NO2 concentrations were selected: all buildings within 1000 m, mean slope within 100 m, primary roads within 2500 m, herbaceous land cover within 2500 m, and average wind speed. Each predictor exhibits distinct temporal patterns of influence throughout the day. Preliminary results for the final model, validated using a Leave-One-Station-Out Cross-Validation (LOOCV) procedure, show a good overall fit (adjusted R2 = 60.5%, LOOCV R2 = 58.0%) and good predictive accuracy. Compared to standard methods, the FLUR model provides an improved understanding of NO2 exposure by capturing its hourly variations and spatial determinants in a complex topographical region.
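To make the validation scheme concrete, the following is a minimal sketch of leave-one-station-out cross-validation for daily 24-hour profiles. It is not the authors' penalized functional model: it assumes an unpenalized least-squares fit with one coefficient per predictor and hour, uses synthetic data, and the function name `loso_cv_r2` and all variable names are hypothetical.

```python
import numpy as np

def loso_cv_r2(X, Y, station_ids):
    """Leave-one-station-out CV R^2 for a least-squares fit of daily profiles.

    X: (n_days, n_predictors) scalar predictors
    Y: (n_days, 24) hourly log-concentration profiles
    station_ids: (n_days,) station label of each row
    """
    X1 = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    Y_hat = np.empty_like(Y, dtype=float)
    for s in np.unique(station_ids):
        test = station_ids == s
        # Fit on all other stations; beta has one row per predictor
        # and one column per hour of the day.
        beta, *_ = np.linalg.lstsq(X1[~test], Y[~test], rcond=None)
        Y_hat[test] = X1[test] @ beta
    ss_res = np.sum((Y - Y_hat) ** 2)
    ss_tot = np.sum((Y - Y.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic example (assumed data, not the study's): 5 stations, 30 days each.
rng = np.random.default_rng(0)
stations = np.repeat(np.arange(5), 30)
X = rng.normal(size=(150, 2))
B = rng.normal(size=(2, 24))            # hour-specific effects
Y = 1.0 + X @ B + 0.05 * rng.normal(size=(150, 24))

r2 = loso_cv_r2(X, Y, stations)
print(f"LOSO CV R^2: {r2:.3f}")
```

Holding out every observation from a station at once, rather than random rows, means the score reflects prediction at locations the model has never seen, which is the use case for a land use regression model.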