Self-supervised Component Segmentation To Improve Object Detection and Classification For Bumblebee Identification
Abstract
The performance of computer vision models for object detection and classification depends heavily on the number of classes and the quality of the input images, particularly in biological applications such as species-level identification of bumblebees. Manual bee identification is time-consuming and costly, and it requires specialized taxonomic training. Various deep-learning-based computer vision models have been shown to overcome this methodological bottleneck by automatically identifying bee species from captured images. However, accurately identifying bee species in images that contain multiple objects of various classes remains challenging because of ambiguity, poor image quality, and noisy backgrounds. Existing pipelines (baselines) primarily rely on object detection to crop bees from images and then classify the species of each cropped instance. This approach is limited by the noisy backgrounds retained in the crops, low resolution, and poor image quality. To address these limitations, we propose an enhanced pipeline that integrates object detection with segmentation to generate body masks for bees and remove background noise. A classification model then identifies the top k species for each masked image. The proposed methodology significantly improves both detection and classification performance in most cases, demonstrating its potential to advance automated identification of bee species in complex image datasets. For the cases where the baselines performed much better, we used Grad-CAM, a state-of-the-art explainable AI technique, to investigate the reasons.
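The masking and top-k steps described above can be illustrated with a minimal sketch. This is not the authors' implementation: the detector, segmentation model, and classifier are replaced here by hand-made stand-ins (a fixed bounding box, a fixed boolean mask, and a fixed probability vector), and the function names `crop_box`, `apply_mask`, and `top_k`, as well as the placeholder species labels, are assumptions made purely for illustration.

```python
import numpy as np


def crop_box(image, box):
    """Crop an (H, W, C) image to a bounding box given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return image[y0:y1, x0:x1]


def apply_mask(crop, mask, fill=0):
    """Keep pixels where the boolean (H, W) mask is True; set the rest to `fill`."""
    out = np.full_like(crop, fill)
    out[mask] = crop[mask]
    return out


def top_k(probs, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    order = np.argsort(probs)[::-1][:k]
    return [(labels[i], float(probs[i])) for i in order]


# Toy stand-ins for model outputs (hypothetical values, not real predictions).
image = np.arange(6 * 6 * 3, dtype=np.uint8).reshape(6, 6, 3)
box = (1, 1, 5, 5)                       # pretend detector output
crop = crop_box(image, box)              # 4 x 4 x 3 crop around the "bee"

mask = np.zeros(crop.shape[:2], dtype=bool)
mask[1:3, 1:3] = True                    # pretend body mask from segmentation

masked = apply_mask(crop, mask)          # background pixels zeroed out

labels = ["species_a", "species_b", "species_c"]   # placeholder class names
probs = np.array([0.1, 0.6, 0.3])        # pretend classifier probabilities
print(top_k(probs, labels, k=2))
```

In a real pipeline the box, mask, and probabilities would come from trained models; the sketch only shows how a body mask removes background pixels from a detected crop before the classifier's ranked predictions are read off.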