MVCRNet: A Semi-Supervised Multi-View Framework for Robust Animal Pose Estimation with Minimal Labeled Data
Abstract
Animal pose estimation depends heavily on labeled data, which limits its use in practical applications. Making effective use of large amounts of unlabeled data to improve pose estimation accuracy is therefore a pressing research problem. In this paper, we propose MVCRNet, a framework that requires only a small, low-cost set of labeled data and exploits large amounts of unlabeled data for accurate animal pose estimation. We first train an initial model on the limited labeled samples and then use it to generate preliminary pseudo-labels for the unlabeled data. Because these pseudo-labels inevitably contain noise, the framework applies the small-loss trick together with confidence evaluation and retains only pseudo-labels that have both low loss and high confidence. This filter can be overly cautious, however, and may discard lower-quality pseudo-labels that still carry learning value. To address this problem, we introduce two strategies: a multi-view triple consistency check that detects and relabels potentially useful pseudo-labels among the high-loss, low-confidence samples, and a teacher-student consistency constraint that further improves the stability and accuracy of the model throughout training. Experiments on two challenging datasets, AP-10K and Zebra, show that our method consistently outperforms existing semi-supervised algorithms, demonstrating its effectiveness for animal pose estimation under data-constrained conditions. Our code and datasets are available at https://github.com/cf2xh123/MVCRNet.
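The pseudo-label filtering step described above can be illustrated with a minimal sketch. The abstract does not give the exact selection rule or thresholds, so everything here is an assumption for illustration: the function name `select_pseudo_labels`, the `keep_ratio` and `conf_threshold` parameters, and the choice to define "low loss" as the smallest-loss fraction of the batch are hypothetical and are not taken from the MVCRNet code.

```python
def select_pseudo_labels(losses, confidences, keep_ratio=0.5, conf_threshold=0.7):
    """Split sample indices into (kept, rejected).

    Hypothetical rule: a pseudo-label is kept only if its loss is among the
    smallest `keep_ratio` fraction of the batch (small-loss trick) AND its
    confidence is at least `conf_threshold`. Both parameters are assumed
    values, not settings reported by the paper.
    """
    if len(losses) != len(confidences):
        raise ValueError("losses and confidences must have the same length")

    n = len(losses)
    num_small = int(n * keep_ratio)

    # Sort indices by ascending loss; the first num_small form the small-loss set.
    by_loss = sorted(range(n), key=lambda i: losses[i])
    small_loss = set(by_loss[:num_small])

    kept = [i for i in range(n)
            if i in small_loss and confidences[i] >= conf_threshold]
    kept_set = set(kept)
    # Rejected samples are the ones a later re-checking stage could revisit.
    rejected = [i for i in range(n) if i not in kept_set]
    return kept, rejected


# Toy per-sample losses and confidences (made-up numbers).
losses = [0.05, 0.90, 0.10, 0.40, 0.08, 0.75]
confidences = [0.95, 0.30, 0.60, 0.85, 0.90, 0.88]
kept, rejected = select_pseudo_labels(losses, confidences)
print(kept)      # [0, 4]
print(rejected)  # [1, 2, 3, 5]
```

In this toy batch, sample 2 has a small loss but low confidence, and sample 3 has high confidence but a larger loss, so both are rejected. Samples like these are the kind that the abstract's multi-view consistency check is meant to re-examine rather than discard outright.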