Language-Driven 3D Skeleton-Based Motion Generation with Action Nesting Graph
Abstract
To address the task of generating human motion from complex natural language instructions, this paper proposes a 3D skeleton-based motion generation method built around an action nesting graph. A language parsing module first constructs the action nesting graph, which captures the segmented structure of the actions in the instruction. A graph convolutional neural network then models the correspondence between this nested structure and the keyframes of the skeleton, and a stage-wise decoupling module improves the naturalness of motion transitions. On the KIT Motion and BEAT-Motion datasets, the method achieves improvements of 14.7% in structural preservation rate and 10.2% in stage boundary consistency. These results indicate that the proposed nesting-based modeling mechanism enhances the model’s ability to interpret complex composite actions and improves the quality of the generated motion.
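The abstract gives no implementation details, so the following is only a minimal sketch of what a nested action structure and one graph-propagation step over it might look like. The names `ActionNode`, `flatten`, and `graph_conv`, the example instruction, and the "sequence"/"parallel" node labels are all hypothetical. The propagation step is a plain neighbourhood mean with no learned weights, used here as a stand-in for a graph convolution layer rather than as the authors' network.

```python
from dataclasses import dataclass, field


@dataclass
class ActionNode:
    """One action phrase; children are sub-actions nested inside it (hypothetical)."""
    label: str
    children: list["ActionNode"] = field(default_factory=list)


def flatten(root):
    """Return (labels, edges), with nodes indexed in depth-first order."""
    labels, edges = [], []

    def visit(node, parent):
        idx = len(labels)
        labels.append(node.label)
        if parent is not None:
            edges.append((parent, idx))  # parent -> nested sub-action
        for child in node.children:
            visit(child, idx)

    visit(root, None)
    return labels, edges


def graph_conv(features, edges):
    """One weight-free propagation step: each node takes the mean of its own
    feature vector and those of its neighbours (edges treated as undirected)."""
    n = len(features)
    neighbours = [{i} for i in range(n)]  # self-loops
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbours
    ]


# Assumed instruction: "walk forward, then sit down while waving"
root = ActionNode("sequence", [
    ActionNode("walk forward"),
    ActionNode("parallel", [ActionNode("sit down"), ActionNode("wave")]),
])
labels, edges = flatten(root)
features = [[float(i)] for i in range(len(labels))]  # toy 1-D node features
smoothed = graph_conv(features, edges)
```

In a full model the node features would presumably come from a text encoder and the propagation would use learned weights. The sketch only illustrates how nesting in the instruction can be turned into graph edges that information flows along.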