Calibrating Feature Representations for Few-shot Image Recognition via Vicinal Mixup
Abstract
The goal of few-shot image recognition (FSIR) is to identify novel categories from a small number of annotated samples by exploiting transferable knowledge from training data (base categories). Metric-based methods use the base categories to learn a feature embedding network, and then keep that network fixed when identifying novel categories. However, because the concepts of novel and base categories differ, the fixed embedding network produces less distinguishable features for novel categories. To address this, we propose a vicinal Mixup method that calibrates the feature representations of novel categories by fine-tuning the embedding network. Unlike traditional Mixup, which mixes all samples in standard image recognition, the proposed method applies Mixup between novel samples and their vicinal base samples to refine the embedding network. It generates plentiful mixed representations that enhance the feature learning of novel categories, resulting in better generalization ability. Moreover, the novel classifier is initialized with class prototypes rather than randomly, which gives a better starting point, and the method can be used for both inductive and transductive FSIR. Experimental results on four standard FSIR datasets and cross-domain FSIR datasets demonstrate the effectiveness of the proposed method.
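The two ideas in the abstract, mixing novel samples with nearby base samples and initializing the novel classifier from class prototypes, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that "vicinal" means the nearest base sample by cosine similarity in feature space, that mixing happens on precomputed feature vectors, and that the mixing weight is drawn from a Beta(alpha, alpha) distribution as in standard Mixup. All function names and parameters are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def class_prototypes(features, labels, num_classes):
    """Mean feature per class; one row per class.

    Assumption: these rows would serve as the initial weights of the
    novel classifier, in place of a random initialization.
    """
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def nearest_base_indices(novel_feats, base_feats):
    """Index of the most cosine-similar base sample for each novel sample.

    Assumption: this nearest-neighbour rule stands in for the paper's
    notion of a 'vicinal' base sample.
    """
    sims = l2_normalize(novel_feats) @ l2_normalize(base_feats).T
    return sims.argmax(axis=1)

def vicinal_mixup(novel_feats, novel_labels, base_feats, base_labels,
                  num_novel, num_base, alpha=1.0, rng=None):
    """Mix each novel feature with its nearest base feature.

    Returns the mixed features and soft targets laid out as
    [novel classes | base classes].
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = nearest_base_indices(novel_feats, base_feats)
    n = len(novel_feats)

    # One mixing coefficient per novel sample, as in standard Mixup.
    lam = rng.beta(alpha, alpha, size=(n, 1))
    mixed = lam * novel_feats + (1.0 - lam) * base_feats[idx]

    # Soft labels: lam on the novel class, (1 - lam) on the base class.
    targets = np.zeros((n, num_novel + num_base))
    rows = np.arange(n)
    targets[rows, novel_labels] = lam[:, 0]
    targets[rows, num_novel + base_labels[idx]] = 1.0 - lam[:, 0]
    return mixed, targets
```

In a fine-tuning loop under these assumptions, `mixed` and `targets` would feed a soft-label cross-entropy loss, with the novel rows of the classifier initialized from `class_prototypes` computed on the support set.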