Sport Classification from Multi-Player Trajectories with Set-over-Time Aggregation
Abstract
Sport classification is commonly studied from RGB video, but multi-player trajectories also provide a compact representation of movement, spacing, and interaction patterns. This paper investigates sport classification in a trajectory-only setting and introduces a set-over-time formulation for multi-player inputs. Each clip is represented as a sequence of frames, where each frame contains an unordered set of visible player trajectories, and the method is evaluated on a four-sport classification task built from MultiSports actor tubes. The proposed model first encodes the player set in each frame with permutation-invariant aggregation and then performs temporal aggregation for clip-level prediction. This design preserves frame-level multi-player structure before temporal summarization. Experiments compare the proposed approach with three compact trajectory baselines: mean pooling, mean-plus-standard-deviation pooling, and GRU-based aggregation. The proposed model achieves the best validation macro-F1 of 0.8793 and test macro-F1 of 0.8614, outperforming the strongest baseline by 0.0386 test macro-F1. Ablation results further show that entity weighting, temporal variability modeling, time attention, and the joint use of position and motion cues all contribute to performance. Error analysis indicates that trajectory-only recognition is effective, but confusion remains among team sports with partially similar collective motion patterns. Overall, the results show that set-over-time aggregation is an effective approach for sport classification from multi-player trajectories.
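The two-stage aggregation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' model: it replaces the learned entity weighting and time attention with a plain masked mean over players, followed by mean-plus-standard-deviation pooling over time. The function names, the padded `(T, N, D)` layout, the visibility mask, and the choice of `D = 4` features (for example position and velocity) are all assumptions made for illustration.

```python
import numpy as np


def frame_set_pool(frames: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Permutation-invariant pooling over the player set in each frame.

    frames: (T, N, D) per-frame player features, padded to N players.
    mask:   (T, N) with 1.0 where a player is visible and 0.0 for padding.
    Returns a (T, D) array holding the masked mean over players.
    """
    weights = mask[..., None]                        # (T, N, 1)
    counts = np.maximum(weights.sum(axis=1), 1.0)    # (T, 1); guards empty frames
    return (frames * weights).sum(axis=1) / counts   # (T, D)


def set_over_time(frames: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Aggregate players within each frame first, then summarize over time.

    Returns a (2 * D,) clip descriptor: temporal mean and temporal standard
    deviation of the per-frame set embeddings (hypothetical stand-in for the
    paper's learned temporal aggregation).
    """
    per_frame = frame_set_pool(frames, mask)         # (T, D)
    return np.concatenate([per_frame.mean(axis=0), per_frame.std(axis=0)])


# Hypothetical clip: 8 frames, up to 5 players, 4 features per player.
rng = np.random.default_rng(0)
clip = rng.normal(size=(8, 5, 4))
visible = (rng.random(size=(8, 5)) > 0.3).astype(float)
descriptor = set_over_time(clip, visible)            # shape (8,)
```

Because the player axis is reduced with a masked mean, reordering the players inside any frame leaves the clip descriptor unchanged, which is the property the set-over-time formulation relies on. A clip-level classifier over the four sports would take `descriptor` as its input.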