Multi-view Learning for Camouflaged Object Detection with PVTv2
Abstract
Camouflaged object detection (COD) has developed rapidly in recent years, and effectively separating objects that closely resemble their background has become a focal point of research. Because camouflaged objects and backgrounds are so similar, traditional networks with a single visual branch often perform poorly in such scenarios. To address this issue, we propose a multi-view learning detection network based on the Pyramid Vision Transformer, named Multi-view Learning for Camouflaged Object Detection with PVTv2 (MVLNet). By combining information from the RGB and noise views, our method provides a more comprehensive description of the relationship between objects and backgrounds, improving the accuracy and robustness of COD. Inspired by human visual attention during observation, we design a Global Context Aggregation Module that uses a U-shaped structure with progressively increasing dilation rates to simulate the human behavior of zooming in and out. Extensive experiments demonstrate that the proposed MVLNet outperforms 22 other representative models on three public datasets.
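To make the "zooming out" intuition concrete, the following minimal sketch computes how the receptive field of a stack of stride-1 dilated convolutions grows as the dilation rate increases. It is an illustration only, not the authors' module: the abstract does not state the kernel size or the dilation schedule, so the 3x3 kernel and the rates 1, 2, 4, 8 are assumptions, and the function name `receptive_fields` is hypothetical. The U-shaped structure itself is not modeled here.

```python
def receptive_fields(dilations, kernel_size=3):
    """Return the receptive field (in pixels, along one axis) after each
    stride-1 dilated convolution in a sequential stack."""
    rf = 1
    fields = []
    for d in dilations:
        # A k-tap kernel with dilation d spans (k - 1) * d + 1 pixels,
        # so each layer widens the receptive field by (k - 1) * d.
        rf += (kernel_size - 1) * d
        fields.append(rf)
    return fields


# Assumed schedule (not taken from the paper): rates double at each layer.
assumed_rates = [1, 2, 4, 8]
print(receptive_fields(assumed_rates))  # [3, 7, 15, 31]
```

Under these assumptions, four layers already cover a 31-pixel span, whereas four undilated 3x3 layers would cover only 9 pixels, which is one way a module could take in wider context without downsampling further.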