Improving the Generalization of Segmentation Foundation Models via Weakly-Supervised and Unsupervised Adaptation


Abstract

The success of large language models has inspired the computer vision community to explore image segmentation foundation models that can zero-/few-shot generalize through prompt engineering. Among these, Segment-Anything (SAM) is the state-of-the-art image segmentation foundation model, demonstrating strong zero-/few-shot generalization. Despite this success, recent studies reveal SAM's weakness under strong distribution shift: it performs poorly on corrupted natural images, camouflaged images, medical images, etc. Motivated by these observations, we develop an adaptation strategy for SAM that supports both weakly-supervised and unsupervised settings. In the weakly-supervised setting, we leverage weak labels, e.g. point-wise or box annotations, together with an anchor model and low-rank finetuning to regularize self-training and improve generalization. In the unsupervised setting, we propose a data pipeline that automatically generates weak labels for target-domain training images, enabling adaptation without manual annotation. To further alleviate error accumulation in self-training, we introduce patch-level contrastive regularization to reduce reliance on noisy pseudo labels, and employ a novel masked image modeling approach that uses teacher-derived features and semantic alignment to improve feature consistency and robustness during adaptation. We conduct extensive validation on five segmentation tasks across diverse domains, including natural, corrupted, medical, camouflaged, and robotic images. Our task-agnostic method, compatible with both SAM and SAM2, consistently surpasses pre-trained SAM and state-of-the-art domain adaptation methods across four segmentation settings using identical prompt inputs.
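The low-rank finetuning mentioned in the abstract can be illustrated with a minimal LoRA-style adapter: the pre-trained weight stays frozen while a small, trainable low-rank update is added to it. This is a generic sketch of the technique, not the paper's implementation; all names, dimensions, and the scaling convention below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper): output dim, input dim,
# adapter rank, and LoRA scaling factor.
d_out, d_in, r, alpha = 8, 8, 2, 4.0

W = rng.standard_normal((d_out, d_in))     # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))                   # zero-initialized so training starts from W

def adapted_forward(x):
    # Effective weight is the frozen W plus the scaled low-rank update B @ A;
    # only A and B would receive gradients during adaptation.
    return (W + (alpha / r) * B @ A) @ x

x = rng.standard_normal(d_in)
# With B = 0 the adapted layer reproduces the frozen layer exactly.
assert np.allclose(adapted_forward(x), W @ x)
```

Because only the rank-r factors are updated, the adapter adds r*(d_in + d_out) trainable parameters per layer instead of d_in*d_out, which is what makes this style of finetuning a lightweight regularizer for self-training.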
