High-Resolution NO<sub>2</sub>, O<sub>3</sub> and PMs Estimation in Puglia: Leveraging AI and Explainability Techniques

Abstract

Air pollution remains a major environmental challenge, with severe impacts on human health and ecosystems. Recent advances in satellite technology have transformed air quality monitoring by enabling global, continuous observations of atmospheric pollutants. However, satellite data often lack the precision of ground-based stations. This study develops a machine learning model to predict daily surface concentrations of key air pollutants (NO<sub>2</sub>, O<sub>3</sub>, PM<sub>10</sub>, PM<sub>2.5</sub>) at high spatial resolution (300 m) in the Apulia (Puglia) region. Using Regional Environmental Protection Agency (ARPA) station data from 2019 to 2022 together with meteorological, geographic, land-use, and temporal variables, we trained an XGBoost model on a 300 m grid. Model performance, assessed by repeated cross-validation, showed an average R<sup>2</sup> of 0.71, with values of 0.77 for NO<sub>2</sub>, 0.78 for O<sub>3</sub>, 0.67 for PM<sub>2.5</sub>, and 0.64 for PM<sub>10</sub>. eXplainable AI (XAI) methods confirmed strong alignment with established scientific knowledge, enhancing model reliability and offering insights into the drivers of pollutant distribution.
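As an illustration of how a repeated cross-validation R<sup>2</sup> like the one reported above can be computed, the following is a minimal sketch using only the Python standard library. It is not the study's pipeline: the one-feature least-squares model stands in for XGBoost, the data are synthetic, and the names `repeated_kfold_r2` and `fit_line`, the fold count `k`, and the number of `repeats` are all assumptions made for this example.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Hypothetical stand-in model: ordinary least squares on one feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def repeated_kfold_r2(xs, ys, k=5, repeats=3, seed=0):
    """Return (mean R^2, per-fold R^2 list) over `repeats` shuffled k-fold splits."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random partition for each repeat
        folds = [idx[i::k] for i in range(k)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            model = fit_line([xs[i] for i in train_idx],
                             [ys[i] for i in train_idx])
            y_pred = [model(xs[i]) for i in test_idx]
            scores.append(r2_score([ys[i] for i in test_idx], y_pred))
    return mean(scores), scores


# Synthetic data (assumption): a noisy linear relationship, not real station data.
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0, 1.0) for x in xs]

avg_r2, fold_scores = repeated_kfold_r2(xs, ys)
print(f"mean R^2 over {len(fold_scores)} folds: {avg_r2:.2f}")
```

Each fold's score is computed only on observations held out from training, and the final figure is the mean over all folds and repeats. With station data, one might instead assign whole stations to folds so that the score reflects prediction at unmonitored locations; whether the study did so is not stated in the abstract. The reported average of 0.71 is consistent with the unweighted mean of the four per-pollutant values, (0.77 + 0.78 + 0.67 + 0.64) / 4 = 0.715.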