DINOSim: Zero-Shot Object Detection and Semantic Segmentation on Microscopy Images


Abstract

We present DINOSim, a novel method for detecting and segmenting objects in microscopy images without the need for large annotated datasets or additional training. DINOSim builds on the pretrained DINOv2 image encoder, which captures semantic information from images. By comparing the encoder’s features of image patches to those of a user-selected reference, DINOSim generates pseudo-labels that guide object detection and segmentation. A k-nearest neighbors framework is then used to refine predictions across new images. Our experiments show that DINOSim can effectively identify and segment previously unseen objects in diverse microscopy datasets, offering performance comparable to supervised approaches while avoiding the need for costly manual labeling. We also investigate how different choices of user prompt selection and model size affect accuracy and generalization. To make the method widely accessible, we provide an open-source Napari plugin ( github.com/AAitorG/napari-DINOSim ), enabling researchers to easily apply DINOSim to their own data. Overall, DINOSim offers a fast, flexible, and practical solution for bioimage analysis, particularly valuable in resource-constrained settings.
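As a rough illustration of the feature-comparison idea described above (not the authors' implementation), the sketch below loads a DINOv2 ViT-S/14 encoder from torch hub, extracts its patch embeddings, and computes a cosine-similarity map between every patch and the patch at a user-selected reference point; the function name and preprocessing assumptions are hypothetical.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch: DINOv2 ViT-S/14 from torch hub, not DINOSim's actual code.
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

@torch.no_grad()
def patch_similarity_map(image, ref_xy, patch=14):
    """Cosine similarity of every patch embedding to the user-clicked reference patch.

    image  : (1, 3, H, W) tensor, H and W multiples of 14, normalized as DINOv2 expects.
    ref_xy : (row, col) pixel coordinates of the user-selected reference point.
    """
    feats = model.forward_features(image)["x_norm_patchtokens"]  # (1, N, C) patch tokens
    h, w = image.shape[-2] // patch, image.shape[-1] // patch
    feats = F.normalize(feats[0], dim=-1)                        # unit-norm embeddings
    ref_idx = (ref_xy[0] // patch) * w + (ref_xy[1] // patch)    # patch index of reference
    sim = feats @ feats[ref_idx]                                 # (N,) cosine similarities
    return sim.reshape(h, w)                                     # low-resolution similarity map
```

A similarity map of this kind could be thresholded into pseudo-labels and then refined (the paper describes a k-nearest neighbors step for generalizing across new images) before upsampling to full image resolution.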