Diving Performance Analysis with 3D Motion Knowledge Hypergraphs
Abstract
Diving actions involve complex temporal dynamics, rapid pose transitions, and strict requirements on entry angle and splash control, making quality assessment a challenging task in computer vision. Existing methods still face limitations in motion structure modeling, depth perception, and multimodal fusion. This paper proposes a multimodal scoring framework that integrates 3D pose reconstruction with a hypergraph neural network to improve the modeling and evaluation of diving actions. We are the first to introduce 3D pose reconstruction into diving score assessment, compensating for the depth-perception limitations of 2D vision by reconstructing complete 3D motion trajectories. To mitigate keypoint errors caused by rapid movement or occlusion, we propose a hypergraph-based spatiotemporal pose fusion model. This model leverages three types of hyperedges (temporal, skeletal, and joint) to build high-order spatiotemporal representations, and incorporates an attention mechanism to adaptively adjust their weights. To capture visual cues such as entry angle and splash patterns, we further design a multimodal fusion module that combines skeletal features with appearance features, significantly enhancing the model's ability to perceive fine details. To address the lack of structured, fine-grained annotations in existing datasets, we also construct the Individual-Diving dataset, which contains 1,023 diving video clips covering 27 action classes and 26 sub-actions, along with frame-wise 3D pose annotations and official scores. Experimental results on the FineDiving and Individual-Diving datasets show that our method consistently outperforms previous approaches such as USDL and CoRe, demonstrating strong performance in diving action modeling and quality assessment.
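To make the two core components of the abstract concrete, the following is a minimal PyTorch sketch, not the paper's implementation: a hypergraph fusion layer that combines three hyperedge types (temporal, skeletal, joint) with learned attention weights, and a multimodal head that concatenates pooled skeletal features with appearance features before regressing a score. All class names, shapes, and the simple HGNN-style update are illustrative assumptions.

```python
import torch
import torch.nn as nn


class HypergraphPoseFusion(nn.Module):
    """Illustrative sketch: attention-weighted fusion over three hyperedge types.

    Assumptions (not from the paper): node features are flattened frame-joint
    nodes, each hyperedge type is given as a binary incidence matrix, and the
    per-type attention is a learned softmax over scalar logits.
    """

    def __init__(self, in_dim: int, out_dim: int, num_edge_types: int = 3):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)
        # One attention logit per hyperedge type (temporal, skeletal, joint).
        self.edge_type_logits = nn.Parameter(torch.zeros(num_edge_types))

    def forward(self, x: torch.Tensor, incidences: list[torch.Tensor]) -> torch.Tensor:
        # x: (N, in_dim) node features, N = num_frames * num_joints
        # incidences: list of (N, E_k) binary incidence matrices, one per type
        alpha = torch.softmax(self.edge_type_logits, dim=0)  # adaptive type weights
        h = self.proj(x)
        out = torch.zeros_like(h)
        for a, H in zip(alpha, incidences):
            deg_v = H.sum(dim=1, keepdim=True).clamp(min=1.0)  # node degrees
            deg_e = H.sum(dim=0, keepdim=True).clamp(min=1.0)  # hyperedge degrees
            # Node -> hyperedge -> node message passing (simple HGNN-style update).
            edge_feat = (H / deg_e).t() @ h          # (E_k, out_dim)
            out = out + a * ((H / deg_v) @ edge_feat)  # (N, out_dim)
        return torch.relu(out)


class MultimodalScorer(nn.Module):
    """Illustrative sketch: fuse skeletal and appearance features for scoring.

    The appearance vector is assumed to come from a video backbone capturing
    cues such as entry angle and splash; the fusion here is plain concatenation
    followed by an MLP regressor.
    """

    def __init__(self, skel_dim: int, app_dim: int, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(skel_dim + app_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, skel_feat: torch.Tensor, app_feat: torch.Tensor) -> torch.Tensor:
        return self.mlp(torch.cat([skel_feat, app_feat], dim=-1))
```

In this sketch the three incidence matrices encode which nodes each hyperedge groups (consecutive frames of one joint, joints along a limb, or all joints in one frame), and the softmax weights let the model emphasize whichever structure is most reliable when rapid motion or occlusion corrupts individual keypoints.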