Reconstruction of Postures Using Partial Body Information Through a Self-supervised Transformer in Mice

Abstract

The comprehensive interpretation of behavior from incomplete data represents a fundamental challenge in computational ethology. Here we present Masked Autoencoder for Transformer-based Estimation and Reconstruction (MATER), a self-supervised learning framework that extracts behaviorally relevant representations from unlabeled rodent pose data by reconstructing strategically masked body keypoints. This approach captures fundamental movement patterns without requiring extensive manual annotation, addressing common experimental challenges including occlusions during social interactions and tracking errors. We evaluate MATER across two-dimensional and three-dimensional rodent pose datasets, demonstrating its robustness under high levels of keypoint masking. The framework achieves high-fidelity reconstructions under these challenging conditions and produces latent representations that support accurate behavioral classification with minimal supervision. Our analyses further reveal that rodent movement exhibits intrinsic spatiotemporal structure, which can be computationally inferred without explicit labeling. Reconstruction performance is tightly linked to the temporal coherence of movement, highlighting the importance of temporal dynamics in behavioral representation. These findings reinforce the emerging view that animal behavior is hierarchically organized and governed by natural temporal dependencies. MATER offers a robust, scalable tool for neuroscientists seeking to analyze complex, naturalistic behaviors across diverse experimental contexts, ultimately advancing our understanding of behavioral architecture and its neural underpinnings.
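The abstract describes the core self-supervised objective: randomly mask a subset of body keypoints, pass the partial pose through a transformer, and train the model to reconstruct the hidden coordinates. The sketch below illustrates that general masked-autoencoding recipe in PyTorch with toy shapes and hyperparameters; it is an assumption-laden illustration of the technique, not the authors' MATER implementation (the model class, keypoint count, and masking ratio here are all hypothetical).

```python
# Illustrative masked-keypoint reconstruction, NOT the MATER codebase.
# All shapes/hyperparameters (8 keypoints, 2D, d_model=64, 50% masking)
# are toy assumptions for the sketch.
import torch
import torch.nn as nn

torch.manual_seed(0)

class MaskedPoseAutoencoder(nn.Module):
    def __init__(self, n_keypoints=8, dim=2, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(dim, d_model)                  # per-keypoint embedding
        self.mask_token = nn.Parameter(torch.zeros(d_model))  # learned "missing" token
        self.pos = nn.Parameter(torch.zeros(n_keypoints, d_model))  # keypoint identity
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, dim)                   # decode back to coordinates

    def forward(self, pose, mask):
        # pose: (B, K, dim) keypoint coordinates; mask: (B, K) bool, True = occluded
        x = self.embed(pose)
        # Replace embeddings of masked keypoints with the shared mask token
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        x = x + self.pos
        return self.head(self.encoder(x))

# One self-supervised training step: the loss is computed only on masked keypoints,
# so the model must infer them from the visible body parts.
model = MaskedPoseAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
pose = torch.randn(16, 8, 2)            # a batch of synthetic 2D poses
mask = torch.rand(16, 8) < 0.5          # hide roughly half the keypoints
recon = model(pose, mask)
loss = ((recon - pose)[mask] ** 2).mean()
loss.backward()
opt.step()
```

Because supervision comes from the data itself (the hidden coordinates), no behavioral labels are needed, which is what allows the abstract's claim of learning from unlabeled pose data; in practice the same encoder's latent representations could then be reused for downstream behavioral classification.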
