Spatiotemporal Graph Autoencoder Network for Skeleton-Based Human Action Recognition

Abstract

Human action recognition (HAR) based on skeleton data is a challenging yet important problem owing to its wide-ranging applications in numerous domains, including patient monitoring, security surveillance, and observation of human-machine interactions. While numerous algorithms have been proposed to distinguish among a wide variety of activities, most practical applications require highly accurate detection of specific activity types. This study proposes a novel and highly accurate spatiotemporal graph autoencoder network for HAR based on skeleton data, together with an extensive investigation employing diverse input modalities. To this end, a spatiotemporal graph autoencoder was constructed to automatically learn both spatial and temporal patterns from human skeleton datasets. The resulting graph convolutional network, designated GA-GCN, outperforms the majority of existing state-of-the-art methods on two common benchmarks, NTU RGB+D and NTU RGB+D 120. On the first dataset, the proposed approach achieved accuracies of 92.3% and 96.8% for the cross-subject and cross-view evaluations, respectively. On the more challenging NTU RGB+D 120 dataset, GA-GCN attained accuracies of 88.8% and 90.4% for the cross-subject and cross-set evaluations, respectively.
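The abstract describes learning spatial patterns over the skeleton graph and temporal patterns over frames. As a rough illustration of that idea (not the paper's actual GA-GCN architecture), the sketch below shows one spatiotemporal graph convolution step on a toy skeleton sequence: a per-frame spatial aggregation over a normalized joint adjacency matrix, followed by a simple temporal smoothing window. All shapes, the joint graph, and the weights are illustrative assumptions.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetrically normalize A + I: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def st_graph_conv(X, A_norm, W_spatial, t_kernel):
    """One illustrative spatiotemporal graph convolution step.

    X: (T, V, C) skeleton sequence (frames, joints, coordinates).
    Spatial step: aggregate each joint's neighbors via A_norm, then
    project features with W_spatial. Temporal step: same-padded
    moving average over a window of t_kernel frames.
    """
    # Spatial graph convolution per frame: A_norm @ X_t @ W_spatial
    spatial = np.einsum('uv,tvc,cd->tud', A_norm, X, W_spatial)
    pad = t_kernel // 2
    padded = np.pad(spatial, ((pad, pad), (0, 0), (0, 0)), mode='edge')
    return np.stack([padded[t:t + t_kernel].mean(axis=0)
                     for t in range(spatial.shape[0])])

# Toy example: 3 joints in a chain (e.g. shoulder-elbow-wrist), 4 frames
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3, 2))  # (frames, joints, 2-D coordinates)
W = rng.standard_normal((2, 8))     # lift 2 coordinates -> 8 features
Y = st_graph_conv(X, normalize_adjacency(A), W, t_kernel=3)
print(Y.shape)  # (4, 3, 8): per-frame, per-joint feature vectors
```

An autoencoder in this style would stack such layers into an encoder that compresses the sequence and a decoder that reconstructs it, with the learned features then reused for action classification.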
