CASTLE: a training-free foundation-model pipeline for cross-species behavioral classification

Abstract

Accurately and efficiently quantifying animal behavior at scale without intensive manual labeling is a long-standing challenge for neuroscience and ethology. Keypoint-based tracking emphasizes simplicity and efficiency but loses the richness of posture and context, while emerging foundation models capture pixel-level details yet often require nontrivial retraining effort and can be more sensitive to backgrounds or lighting. Here, we present CASTLE, a training-free pipeline that addresses these issues by combining foundation models for segmentation, tracking, and feature extraction. By isolating regions of interest (ROIs), CASTLE first generates "focused" (ROI-masked) and orientation-invariant latent features, capturing rich postural details in a zero-shot manner, without fine-tuning. Following ROI isolation, CASTLE's interactive "Behavior Microscope" module supports hierarchical, progressive, human-in-the-loop embedding and clustering. This enables raw-image-assisted discovery of behavioral classes without predefined categories. Across mice, Drosophila, and C. elegans, CASTLE matches expert class annotations (>90%) and reveals disease-relevant phenotypes in Parkinsonian mouse models. By eliminating purpose-specific model training and providing an accessible, raw-image-informed workflow, CASTLE offers a scalable framework for interpretable, cross-species behavioral phenotyping.
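
To make the ROI-masking and orientation-normalization ideas concrete, here is a minimal Python sketch. It is not CASTLE's implementation: the abstract does not specify how orientation invariance is obtained, so the use of second-order image moments to estimate a body axis is an assumption made purely for illustration, and the function names `mask_roi` and `principal_axis_angle` are hypothetical. The segmentation mask is taken as given here, whereas CASTLE obtains it from a foundation model.

```python
import math

def mask_roi(frame, mask):
    """Zero out every pixel outside the ROI mask (both are 2-D lists of equal shape)."""
    return [[px if m else 0 for px, m in zip(frow, mrow)]
            for frow, mrow in zip(frame, mask)]

def principal_axis_angle(mask):
    """Angle (radians) of the mask's major axis, from second-order central moments.

    Illustrative assumption only: rotating each frame by the negative of this
    angle is one simple way to reduce orientation dependence before feature
    extraction. The axis is ambiguous by 180 degrees (head vs. tail).
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, m in enumerate(row) if m]
    if not pts:
        raise ValueError("empty mask")
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pts) / n
    mu02 = sum((y - cy) ** 2 for _, y in pts) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pts) / n
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)
```

In such a sketch, the masked and rotated frame would then be passed to a pretrained feature extractor, and the resulting latent vectors would be embedded and clustered; those later stages are omitted here because the abstract does not describe them in enough detail.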
