Comparative Analysis of Machine Learning Models for Multi-Horizon PM2.5 Forecasting

Shengqi Shao

Abstract

Accurate forecasting of fine particulate matter (PM2.5) concentrations is critical for public health management and environmental policy-making. This study compares six machine learning models for multi-horizon PM2.5 prediction: Linear Regression, Support Vector Regression (SVR), Random Forest (RF), Gradient Boosting Decision Trees (GBDT), Multi-Layer Perceptron (MLP), and Long Short-Term Memory (LSTM). Using hourly air quality data from 11 cities in Zhejiang Province, China (January to February 2024), we evaluate model performance at three forecast horizons: 1, 6, and 24 hours ahead. Our results show that model performance, and the identity of the best model, vary substantially with the forecast horizon. For short-term (1-hour) predictions, Linear Regression performs best (RMSE = 10.682, R² = 0.901), suggesting near-linear short-term temporal dynamics. At the 24-hour horizon, ensemble tree-based models outperform the others, with GBDT achieving RMSE = 24.264 and R² = 0.467. Surprisingly, the deep learning model (LSTM) underperforms the traditional machine learning methods, particularly for long-term forecasting. Feature importance analysis attributes 47.8% of total importance to the most recent PM2.5 value (lag-1) and 42.3% to the Air Quality Index, highlighting the dominance of temporal autocorrelation in PM2.5 dynamics.
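To make the multi-horizon setup concrete, the following is a minimal sketch of how lagged features and horizon-specific targets might be built from an hourly series, and how RMSE and R² are computed. It is not the authors' pipeline. The function names (`make_supervised`, `rmse`, `r2`), the choice of three lags, the synthetic AR(1) series, and the persistence baseline (predicting the latest observed value) are all assumptions made for illustration; the persistence baseline is not one of the six models compared in the study.

```python
import math
import random

def make_supervised(series, n_lags, horizon):
    """Build (lag-feature, target) pairs for one forecast horizon.

    Each feature row holds the n_lags most recent observations up to
    time t, most recent first. The target is the value at t + horizon.
    """
    X, y = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        X.append([series[t - k] for k in range(n_lags)])
        y.append(series[t + horizon])
    return X, y

def rmse(y_true, y_pred):
    """Root mean squared error."""
    sq = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    return math.sqrt(sq / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

# Synthetic autocorrelated hourly series (illustrative only, NOT the study's data).
random.seed(0)
series = [0.0]
for _ in range(24 * 14 - 1):
    series.append(0.9 * series[-1] + random.gauss(0.0, 1.0))

# Evaluate a persistence baseline at the three horizons named in the abstract.
results = {}
for horizon in (1, 6, 24):
    X, y = make_supervised(series, n_lags=3, horizon=horizon)
    persistence = [row[0] for row in X]  # forecast = latest observed value
    results[horizon] = (rmse(y, persistence), r2(y, persistence))
    print(f"h={horizon:>2}  RMSE={results[horizon][0]:.3f}  R2={results[horizon][1]:.3f}")
```

On a strongly autocorrelated series like this one, the latest value alone is a useful predictor at short horizons and degrades as the horizon grows, which is consistent in spirit with the lag-1 dominance and the horizon-dependent accuracy reported in the abstract. Any of the six models could be fitted on the `X`, `y` pairs in place of the baseline.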

Version published to 10.21203/rs.3.rs-8724199/v1 on Research Square
Jan 29, 2026

Related articles

Machine Learning–Based Prediction of Particulate Matter and Gaseous Pollutants in Mega Cities

This article has 1 author:
1. Hümeyra Bolakar Tosun
This article has no evaluations. Latest version: Feb 20, 2026

Comparative Study of Arima, Lstm and Prophet Models for Time Series Forecasting: A Comprehensive Review

This article has 1 author:
1. Hiteash Mahajan
This article has no evaluations. Latest version: Jan 27, 2026

Dynamic Ensemble Learning with Explainability for Photovoltaic Power Prediction

This article has 6 authors:
1. Fethi Achouri
2. Fouzi Harrou
3. Mehdi Damou
4. Benamar Bouyeddou
5. Abdelhakim Dorbane
6. Ying Sun
This article has no evaluations. Latest version: Feb 19, 2026
