A novel spatiotemporal prediction approach to fill air pollution data gaps using mobile sensors, machine learning and citizen science techniques

Abstract

Particulate Matter (PM) air pollution poses significant threats to public health. Existing models for predicting PM levels range from Chemical Transport Models to statistical approaches, with Machine Learning (ML) tools showing superior performance due to their ability to capture highly non-linear atmospheric responses. This research introduces a novel methodology leveraging ML tools to predict PM2.5 levels at a fine spatial resolution of 30 metres and temporal scale of 10 seconds. The methodology aims to demonstrate its proficiency in estimating missing PM2.5 measurements in urban areas that lack direct observational data. A hybrid dataset was curated from an intensive aerosol campaign in Selly Oak, Birmingham, UK, utilizing citizen scientists and low-cost Optical Particle Counters (OPCs) strategically placed in both static and mobile settings. Spatially resolved proxy variables, meteorological parameters, and aerosol properties were integrated, enabling a fine-grained analysis of PM2.5 distribution along road segments. Calibration involved three approaches: Standard Random Forest Regression, Sensor Transferability Evaluation, and Road Transferability Evaluation. Results demonstrated high predictive accuracy (R² = 0.85, MAE = 1.60 µg m⁻³) for the standard RF model. Sensor and road transferability evaluations exhibited robust generalization capabilities across different sensors (best R² = 0.65, MAE = 2.76 µg m⁻³) and road types (R² = 0.71, MAE = 2.46 µg m⁻³), respectively. This methodology has the potential to significantly enhance spatial resolution beyond regulatory monitoring infrastructure, thereby refining air quality predictions and improving exposure assessments. The findings underscore the importance of ML-based approaches in advancing our understanding of PM2.5 dynamics and their implications for public health.
The paper has important implications for citizen science initiatives, as it suggests that the contributions of a small number of participants can significantly enhance the understanding of local air quality patterns for many thousands of residents.
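
The abstract reports R² and MAE for a standard model and for evaluations that test transfer across sensors and road types. As a rough illustration only, the sketch below shows how the two metrics are computed and how a leave-one-group-out split could hold out every reading from one sensor (or one road type) at a time. The abstract does not describe the actual splitting procedure, so the grouping scheme, the function names, and the sensor labels here are assumptions, not the authors' method.

```python
from collections import defaultdict


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the measurements."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) per group.

    `groups` holds one label per sample, e.g. a sensor ID or a road type
    (hypothetical labels; the abstract does not specify the grouping).
    """
    by_group = defaultdict(list)
    for index, label in enumerate(groups):
        by_group[label].append(index)
    for label, test_indices in by_group.items():
        train_indices = [i for i, g in enumerate(groups) if g != label]
        yield label, train_indices, test_indices


# Hypothetical sensor labels for six samples, purely for illustration.
sensor_ids = ["s1", "s1", "s2", "s2", "s3", "s3"]
splits = list(leave_one_group_out(sensor_ids))
```

In such a scheme, a model would be trained on `train_indices` and scored with `mae` and `r2` on `test_indices` for each held-out group, so the scores reflect performance on a sensor or road the model has never seen.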
