Towards Green Transportation: Predictive Modeling of Intersection Congestion Using Machine Learning for Sustainable Urban Traffic Management
Abstract
Intersection congestion, with its frequent vehicle stops, is one of the main contributors to urban pollution. These interruptions increase fuel consumption and emissions of harmful gases (CO, NO2, SO2, O3) and other pollutants such as fine particulates (PM10, PM2.5), which can adversely affect the respiratory, cardiac, and neurological health of city residents. To address the growing demand for smart and sustainable transportation systems in large cities, predicting intersection congestion using artificial intelligence offers a promising solution. In this study, we present a predictive modeling approach to classify congestion levels at intersections controlled by traffic lights. Using the CN+ dataset collected in Bremen, Germany, our methodology incorporates vehicle and environmental features to predict congestion levels, with the aim of optimizing traffic flow and reducing pollutant emissions. We employ data preprocessing, feature engineering, and machine learning techniques, including a novel feature selection method called Dual Importance Intersection Feature Selection (DIFS), which combines Random Forest (RF) importance and Chi-square analysis. We tested several classifiers, namely RF, XGBoost, LightGBM, CatBoost, and an Artificial Neural Network (ANN), using SMOTE oversampling to address class imbalance. The results indicate that RF achieved the highest overall F1-score (0.75) and Quadratic Weighted Kappa (QWK) score (0.54), demonstrating its robustness in congestion classification. While the gradient-boosting methods XGBoost, LightGBM, and CatBoost exhibited competitive performance (F1-scores between 0.71 and 0.72), ANN lagged behind in both F1-score (0.69) and runtime efficiency. Among all models, RF not only delivered the best balance of precision, recall, and F1-score but also outperformed the others in computational efficiency, making it a suitable choice for real-time congestion prediction.
These findings highlight the importance of feature selection and model selection in achieving reliable traffic congestion forecasting, and they support our approach as a practical tool for managing traffic sustainably and efficiently.
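The abstract does not spell out how DIFS combines its two signals. The sketch below shows one plausible reading, in which a feature is kept only if it ranks in the top k under both Random Forest importance and the Chi-square statistic. The function names, the top-k rule, the feature names, and the scores are all hypothetical and serve only as an illustration; they are not taken from the study.

```python
def top_k(scores, k):
    """Return the names of the k highest-scoring features."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def dual_importance_intersection(rf_importance, chi2_scores, k):
    """Keep features that rank in the top k under BOTH scores (assumed rule)."""
    common = top_k(rf_importance, k) & top_k(chi2_scores, k)
    # Order the surviving features by Random Forest importance for readability.
    return sorted(common, key=rf_importance.get, reverse=True)


# Hypothetical scores for illustration only; not values from the study.
rf_importance = {"speed": 0.30, "hour": 0.25, "temperature": 0.20,
                 "rain": 0.15, "weekday": 0.10}
chi2_scores = {"speed": 40.0, "hour": 30.0, "temperature": 3.0,
               "rain": 25.0, "weekday": 18.0}

selected = dual_importance_intersection(rf_importance, chi2_scores, k=3)
print(selected)
```

In practice the two score dictionaries would come from a fitted Random Forest and a Chi-square test on the training data. Requiring agreement between both rankings is a conservative choice, since a feature favored by only one method is dropped.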
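Because congestion levels are ordered, QWK complements the F1-score by penalizing a prediction more heavily the further it lands from the true level. The following is a minimal sketch of the standard quadratic weighted kappa computation, assuming the levels are encoded as integers 0 to n_classes - 1. The example labels are made up for illustration and are not drawn from the CN+ dataset.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic disagreement weights for ordinal labels."""
    # Observed confusion matrix: rows are true levels, columns are predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    total = len(y_true)
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(observed[i][j] for i in range(n_classes))
                for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Penalty grows with the squared distance between the two levels.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            expected = row_sums[i] * col_sums[j] / total
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    # Undefined (division by zero) if all labels fall into a single class.
    return 1.0 - weighted_observed / weighted_expected


# Made-up labels for three congestion levels (0 = low, 1 = medium, 2 = high).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
kappa = quadratic_weighted_kappa(y_true, y_pred, n_classes=3)
print(round(kappa, 3))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so the reported QWK of 0.54 sits roughly midway between the two.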