Feature Transformer and LightGBM Ensemble for Ship Trajectory Recognition Using Real AIS Data
Abstract
The Automatic Identification System (AIS) generates massive volumes of real-world ship trajectory data, providing a critical foundation for maritime ship-type classification. However, existing methods often struggle to simultaneously capture long-range temporal dependencies, maintain computational efficiency, and ensure model interpretability, making accurate multi-class classification challenging in real-world maritime environments. To address these limitations, this study proposes a robust and efficient hybrid framework that integrates a Feature Transformer module for deep temporal feature extraction with a LightGBM model for ensemble classification. The multi-head self-attention within the Feature Transformer captures long-range dependencies in preprocessed AIS sequences to generate compact 64-dimensional trajectory fingerprints. These deep representations are concatenated with 103 carefully designed kinematic, geometric, statistical, frequency-domain, and segment-level features and fed into a LightGBM classifier for final ship-type identification. We evaluate the framework on a real-world AIS dataset of 2196 trajectories collected between 2019 and 2023, covering 14 ship types under a natural long-tail distribution. Across five random seeds, the proposed hybrid model achieves 78.06% ± 1.15% accuracy (95% CI) and 74.09% ± 1.82% Macro-F1 (95% CI), significantly outperforming Transformer-only (65.09% accuracy) and LightGBM-only (66.85%) baselines, with paired statistical tests confirming the improvement (McNemar χ² = 172.07, p < 10⁻³⁹ vs. Transformer; χ² = 92.24, p < 10⁻²¹ vs. LightGBM). The hybrid model offers ultra-fast inference at 0.051 ms per trajectory on GPU at batch size 128 (≈19,500 samples/s), and provides instance-level interpretability via SHapley Additive exPlanations (SHAP) analysis. These properties make the framework practical for near-real-time maritime traffic monitoring and decision support.
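The fusion step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the head count, the random weights, the 50-point sequence length, mean pooling, and the helper names `multi_head_self_attention`, `trajectory_fingerprint`, and `fuse` are all assumptions made for illustration. Only the 64-dimensional fingerprint and the 103 handcrafted features come from the abstract.

```python
import numpy as np

D_MODEL = 64          # fingerprint width stated in the abstract
N_HANDCRAFTED = 103   # handcrafted feature count stated in the abstract
N_HEADS = 4           # assumption: the abstract does not give the head count


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, n_heads):
    """Scaled dot-product self-attention over one sequence x of shape (T, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    # Project, then split the model dimension into heads: (n_heads, T, d_head).
    q = (x @ w_q).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    k = (x @ w_k).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    v = (x @ w_v).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (n_heads, T, T)
    scores -= scores.max(axis=-1, keepdims=True)         # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over keys
    heads = weights @ v                                  # (n_heads, T, d_head)
    # Merge the heads back into (T, d_model) and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o


def trajectory_fingerprint(x, params, n_heads=N_HEADS):
    """Pool attended time steps into one vector (mean pooling is an assumption)."""
    return multi_head_self_attention(x, *params, n_heads).mean(axis=0)


def fuse(fingerprint, handcrafted):
    """Concatenate the deep fingerprint with the handcrafted feature vector."""
    return np.concatenate([fingerprint, handcrafted])


rng = np.random.default_rng(0)
# Random stand-ins for trained projection weights (W_q, W_k, W_v, W_o).
params = tuple(
    rng.normal(scale=D_MODEL ** -0.5, size=(D_MODEL, D_MODEL)) for _ in range(4)
)
sequence = rng.normal(size=(50, D_MODEL))     # hypothetical: 50 embedded AIS points
handcrafted = rng.normal(size=N_HANDCRAFTED)  # placeholder for the 103 features

fused = fuse(trajectory_fingerprint(sequence, params), handcrafted)
print(fused.shape)  # (167,)
```

Each trajectory thus yields one 167-dimensional row (64 + 103). Stacking such rows gives the tabular input that a gradient-boosted classifier such as `lightgbm.LGBMClassifier` could be trained on. Keeping the handcrafted features as named columns is also what would let a SHAP analysis attribute predictions to individual features.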