Few-Shot Semantic Segmentation of Batik Patterns via Attention-Weighted Hierarchical Decoding
Abstract
The digital preservation of batik, a world intangible cultural heritage, is hindered by the difficulty of accurately segmenting its complex patterns when only a few annotated samples are available. To address this few-shot learning challenge, we construct a few-shot batik pattern dataset and propose a novel network architecture centered on attention weighting and hierarchical decoding. The method uses a pre-trained ResNet101 backbone for transfer learning to establish a strong feature foundation. A dual-attention module combines spatial and channel attention to dynamically highlight the semantically rich regions and intricate texture boundaries specific to batik. For multi-scale context aggregation, a lightweight module of parallel dilated convolutions efficiently captures features at varying receptive fields. Finally, a hierarchical decoder progressively integrates these enhanced multi-scale features with high-resolution shallow features to reconstruct precise segmentation maps. Comprehensive evaluations on the dedicated batik dataset show that our model achieves state-of-the-art performance, with a mean Intersection over Union (mIoU) of 79.22% and a Pixel Accuracy (PA) of 92.47%. It improves over the strong DeepLabV3+ baseline by 3.3% in mIoU and 0.95% in PA, demonstrating its effectiveness for batik pattern segmentation under data-scarce conditions.
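The two reported metrics, mIoU and PA, can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions; the abstract does not say how the authors handle background pixels or classes that never occur, so the choice below to skip absent classes is an assumption. The function names and the toy 2x3 label maps are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[Sequence[int]],
                     target: Sequence[Sequence[int]],
                     num_classes: int) -> List[List[int]]:
    """Return cm where cm[t][p] counts pixels of true class t predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            cm[t][p] += 1
    return cm


def pixel_accuracy(cm: List[List[int]]) -> float:
    """PA = correctly labelled pixels / all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(len(cm)))
    return correct / total


def mean_iou(cm: List[List[int]]) -> float:
    """mIoU = mean over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:  # assumption: classes absent from both maps are skipped
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy label maps with three classes (hypothetical data, not from the paper).
target = [[0, 0, 1],
          [1, 2, 2]]
pred = [[0, 1, 1],
        [1, 2, 0]]

cm = confusion_matrix(pred, target, num_classes=3)
pa = pixel_accuracy(cm)   # 4 of 6 pixels are correct
miou = mean_iou(cm)       # per-class IoU: 1/3, 2/3, 1/2
print(f"PA = {pa:.4f}, mIoU = {miou:.4f}")
```

On this toy input four of six pixels match, so PA is 2/3, while the per-class IoU values of 1/3, 2/3, and 1/2 average to an mIoU of 0.5. The gap between the two shows why segmentation results are usually reported with both metrics: PA is dominated by frequent classes, whereas mIoU weights every class equally.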