Interpretable Multi-Horizon Machine Learning Framework for PM₂.₅ Forecasting in Tashkent: Toward Early-Warning Air Quality Management

Abstract

Fine particulate matter (PM₂.₅) poses a major environmental and public health risk in Central Asia, yet predictive air-quality modeling remains limited due to fragmented monitoring networks and data scarcity. This study presents an interpretable multi-horizon machine-learning framework for PM₂.₅ forecasting in Tashkent, Uzbekistan, representing the first such analysis for the country. Six models were developed and evaluated under realistic data-limited conditions using hourly air-quality and meteorological observations: linear regression, ridge regression, LASSO, random forest, XGBoost, and long short-term memory (LSTM). Forecasts were generated for three operational horizons: 1 h, 24 h, and 168 h (one week). Results show that short-term PM₂.₅ persistence dominates predictive skill, with XGBoost achieving the highest accuracy and stability and outperforming LSTM on fragmented datasets. Feature-selection and SHAP analyses provide transparent insight into the dominant pollution drivers, enhancing policy relevance. Spatial aggregation across monitoring stations improves robustness for city-scale early-warning applications, albeit with reduced sensitivity to pollution peaks. The proposed framework offers a data-efficient and interpretable pathway for operational air-quality management in emerging monitoring contexts.
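
To make the multi-horizon setup concrete, the following is a minimal sketch of how an hourly PM₂.₅ series might be turned into lagged inputs and horizon-shifted targets, together with the naive persistence baseline that the abstract's finding on short-term persistence suggests as a reference point. It is not the authors' pipeline: the function names (`make_supervised`, `persistence_forecast`, `rmse`) and the choice of lag window are hypothetical, and only the three horizons are taken from the abstract.

```python
import numpy as np

# Forecast horizons in hours, as stated in the abstract.
HORIZONS = (1, 24, 168)


def make_supervised(series, n_lags, horizon):
    """Build (X, y) from an hourly series.

    Each row of X holds the n_lags most recent values up to time t
    (oldest first); y is the value observed at t + horizon.
    n_lags is an illustrative assumption, not a value from the study.
    """
    series = np.asarray(series, dtype=float)
    # Keep only times t where the full lag window and the target both exist.
    t_idx = np.arange(n_lags - 1, len(series) - horizon)
    if t_idx.size == 0:
        raise ValueError("series too short for this lag window and horizon")
    X = np.stack([series[t - n_lags + 1 : t + 1] for t in t_idx])
    y = series[t_idx + horizon]
    return X, y


def persistence_forecast(X):
    """Naive baseline: predict the latest observed value for t + horizon."""
    return X[:, -1]


def rmse(y_true, y_pred):
    """Root-mean-square error between observations and predictions."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))
```

Calling `make_supervised(pm25, n_lags, h)` once per `h` in `HORIZONS` yields one training set per horizon. Any of the six models could then be fitted on each set and compared against `persistence_forecast` using `rmse`, ideally with a chronological train/test split so that no future observations leak into training.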
