GLFormer: A Global-Local Spatial-Temporal Dependency Interaction Transformer with Hierarchical Fusion for Traffic Flow Prediction

Abstract

With the accelerating pace of urbanization, traffic flow prediction plays a vital role in alleviating congestion, improving road utilization efficiency, and advancing the development of smart cities. Accurate traffic flow prediction not only enhances the operational efficiency of transportation systems but also supports, to a certain extent, sustainable urban mobility and emergency management. However, existing traffic flow prediction models still face several challenges in spatial-temporal dependency modeling: 1) traffic flow exhibits dynamic variations, yet current methods lack the ability to adaptively adjust the weights of embedded information when addressing real-world factors such as varying traffic densities across road segments and periodic temporal patterns; 2) when capturing complex temporal dependencies, effective mechanisms for long-term modeling are insufficient, making it difficult to account jointly for both long-term and short-term temporal dependencies; 3) limitations persist in the interaction and fusion of spatial-temporal features, hindering adequate modeling of the deep interactions between temporal and spatial characteristics. To address these challenges, this paper proposes GLFormer, which models global and local spatial-temporal dependencies through a hierarchical interaction fusion strategy. The method employs an improved data embedding technique to dynamically adjust the weights assigned to different types of spatial-temporal information. For temporal modeling, a Global Temporal Self-Attention (GLoTSA) module is designed by integrating self-attention with global temporal convolution to capture multi-scale temporal dependencies. For spatial-temporal feature fusion, a hierarchical interaction fusion mechanism is constructed to progressively strengthen the deep interactions between temporal and spatial features. Experiments conducted on three public datasets demonstrate that GLFormer outperforms several state-of-the-art baseline models in terms of mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE), with average improvements of 16.05%, 18.31%, and 12.67%, respectively. The key source code and data are available at https://github.com/lsy-study/GLFormer.
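The abstract's description of the GLoTSA module, which integrates self-attention with a global temporal convolution branch to capture both short- and long-term temporal dependencies, suggests a structure along the following lines. This is a minimal PyTorch-style sketch for illustration only: the class name, depthwise-convolution branch, gated fusion, and hyperparameters are assumptions and are not taken from the authors' released code at the repository above.

```python
import torch
import torch.nn as nn


class GlobalLocalTemporalAttention(nn.Module):
    """Illustrative sketch: combine multi-head self-attention (short-term
    dependencies) with a global temporal convolution branch (long-term
    trends), then fuse the two with a learned gate. Names and design
    choices here are assumptions, not the paper's exact implementation."""

    def __init__(self, d_model: int = 64, n_heads: int = 4, kernel_size: int = 7):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Depthwise 1-D convolution over the time axis stands in for the
        # "global temporal convolution" branch described in the abstract.
        self.temporal_conv = nn.Conv1d(
            d_model, d_model, kernel_size, padding=kernel_size // 2, groups=d_model
        )
        self.gate = nn.Linear(2 * d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch * num_nodes, seq_len, d_model)
        attn_out, _ = self.attn(x, x, x)  # local / short-term dependencies
        conv_out = self.temporal_conv(x.transpose(1, 2)).transpose(1, 2)  # global trend
        # Gated fusion of the two branches, followed by a residual connection.
        g = torch.sigmoid(self.gate(torch.cat([attn_out, conv_out], dim=-1)))
        return self.norm(x + g * attn_out + (1 - g) * conv_out)


if __name__ == "__main__":
    # Toy input: 2 sequences x 12 time steps x 64 features per node.
    x = torch.randn(2, 12, 64)
    out = GlobalLocalTemporalAttention()(x)
    print(out.shape)  # torch.Size([2, 12, 64])
```

The gated residual fusion is one plausible way to let the model weight short-term attention against the long-range convolutional context per time step; the paper's hierarchical interaction fusion between temporal and spatial features is a separate mechanism not covered by this sketch.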