TRACE: End-to-end temporal inference and annotation of animal behaviors from video
Abstract
Quantitative analysis of animal behavior is fundamental to neuroscience and ethology but remains constrained by the scalability, subjectivity, and limited reproducibility of manual annotation. Most automated approaches infer behavior through predefined intermediate representations such as pose trajectories, which require task-specific design choices and often omit contextual visual information essential for behavioral interpretation. Here we introduce TRACE (Temporal Recognition of Animal Behaviors Captured from Video), an end-to-end method with a graphical user interface for detecting and annotating animal behavior from raw video. TRACE leverages a transformer-based video encoder pretrained via self-supervised learning to extract hierarchical temporal features, combined with multi-scale temporal modeling to capture behaviors spanning diverse timescales. The method jointly predicts behavioral identity and temporal boundaries from continuous video recordings with high-speed inference. Across multiple behavioral datasets spanning different species and experimental contexts, TRACE demonstrates robust and generalizable performance, enabling scalable, context-aware analysis of animal behavior directly from video.
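To make the notion of temporal annotation concrete, the following is a minimal sketch of the kind of output such a method produces: a list of behavior segments, each with a label and start and end times. It is not TRACE's implementation, which the abstract describes as predicting identity and boundaries jointly. As a simplifying assumption, it instead collapses a hypothetical sequence of per-frame labels into contiguous bouts. The names `Segment` and `frames_to_segments`, the example labels, and the frame rate are all illustrative.

```python
from itertools import groupby
from typing import List, NamedTuple, Sequence


class Segment(NamedTuple):
    """One annotated behavior bout (hypothetical structure, not from TRACE)."""
    label: str
    start_s: float  # inclusive start time in seconds
    end_s: float    # exclusive end time in seconds


def frames_to_segments(frame_labels: Sequence[str], fps: float) -> List[Segment]:
    """Collapse consecutive identical per-frame labels into timed segments."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    segments: List[Segment] = []
    frame_idx = 0
    for label, run in groupby(frame_labels):
        run_len = sum(1 for _ in run)  # number of frames in this bout
        segments.append(
            Segment(label, frame_idx / fps, (frame_idx + run_len) / fps)
        )
        frame_idx += run_len
    return segments


if __name__ == "__main__":
    # Illustrative labels only; real recordings would have many more frames.
    labels = ["rest", "rest", "groom", "groom", "groom", "rest"]
    for seg in frames_to_segments(labels, fps=2.0):
        print(f"{seg.label}: {seg.start_s:.1f}s to {seg.end_s:.1f}s")
```

A segment list of this form is what downstream analyses typically consume, for example to compute bout counts or durations per behavior.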