ASBAR: an Animal Skeleton-Based Action Recognition framework. Recognizing great ape behaviors in the wild using pose estimation with domain adaptation

Read the full article See related articles


To date, the investigation and classification of animal behaviors have mostly relied on direct human observations or video recordings with posthoc analysis, which can be labor-intensive, time-consuming, and prone to human bias. Recent advances in machine learning for computer vision tasks, such as pose estimation and action recognition, thus have the potential to significantly improve and deepen our understanding of animal behavior. However, despite the increased availability of open-source toolboxes and large-scale datasets for animal pose estimation, their practical relevance for behavior recognition remains under-explored. In this paper, we propose an innovative framework, ASBAR , for Animal Skeleton-Based Action Recognition , which fully integrates animal pose estimation and behavior recognition. We demonstrate the use of this framework in a particularly challenging task: the classification of great ape natural behaviors in the wild. First, we built a robust pose estimator model leveraging OpenMonkeyChallenge, one of the largest available open-source primate pose datasets, through a benchmark analysis on several CNN models from DeepLabCut, integrated into our framework. Second, we extracted the great ape’s skeletal motion from the PanAf dataset, a large collection of in-the-wild videos of gorillas and chimpanzees annotated for natural behaviors, which we used to train and evaluate PoseConv3D from MMaction2, a second deep learning model fully integrated into our framework. We hereby classify behaviors into nine distinct categories and achieve a Top 1 accuracy of 74.98%, comparable to previous studies using video-based methods, while reducing the model’s input size by a factor of around 20. Additionally, we provide an open-source terminal-based GUI that integrates our full pipeline and release a set of 5,440 keypoint annotations to facilitate the replication of our results on other species and/or behaviors. All models, code, and data can be accessed at: .

Author summary

The study of animal behaviors has mostly relied on human observations and/or video analysis traditionally. In this paper, we introduce a new framework called ASBAR (for Animal Skeleton-Based Action Recognition ) that integrates recent advances in machine learning to classify animal behaviors from videos. Compared to other methods that use the entire video information, our approach relies on the detection of the animal’s pose (e.g., position of the head, eyes, limbs) from which the behavior can be recognized. We demonstrate its successful application in a challenging task for computers as it classifies nine great ape behaviors in their natural habitat with high accuracy. To facilitate its use for other researchers, we provide a graphical user interface (GUI) and annotated data to replicate our results for other animal species and/or behaviors.

Article activity feed