Calibrating Feature Representations for Few-shot Image Recognition via Vicinal Mixup

Wuyuan Ye
Zhengdong Luo
Mengcheng Chen

Abstract

The goal of few-shot image recognition (FSIR) is to identify novel categories from a small number of annotated samples by exploiting transferable knowledge from training data (base categories). Metric-based methods use the base categories to learn a feature embedding network and then keep that network fixed when identifying novel categories. However, because the concepts of novel and base categories differ, the fixed embedding network produces less distinguishable features for novel categories. To address this, we propose a vicinal Mixup method that calibrates the feature representations of novel categories by fine-tuning the embedding network. Unlike traditional Mixup, which mixes all samples in standard image recognition, the proposed method applies Mixup between novel samples and vicinal base samples to refine the embedding network. It generates plentiful mixed representations that enhance feature learning for novel categories, resulting in better generalization. Moreover, initializing the novel classifier with class prototypes gives a better starting point than random initialization, and it can be used for both inductive FSIR and transductive FSIR. Experimental results on four standard FSIR datasets and on cross-domain FSIR datasets demonstrate the effectiveness of the proposed method.
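The two ideas in the abstract, mixing novel samples with nearby base samples and initializing the novel classifier from class prototypes, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say whether mixing happens on images or features, how vicinity is measured, or how the mixing coefficient is drawn, so the sketch assumes feature-level mixing, cosine similarity as the vicinity measure, and a coefficient sampled from Beta(alpha, alpha) as in standard Mixup. The function names and the parameters `k` and `alpha` are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def class_prototypes(features, labels):
    """Mean feature vector per class, usable as initial classifier weights."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(feat))
        for i, x in enumerate(feat):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}


def vicinal_mixup(novel_feat, novel_label, base_feats, base_labels,
                  k=2, alpha=1.0, rng=random):
    """Mix one novel feature with each of its k most similar base features.

    Assumption: "vicinal" means nearest by cosine similarity. Novel and base
    label sets are assumed disjoint, as in the few-shot setting.
    Returns a list of (mixed_feature, soft_target) pairs.
    """
    ranked = sorted(range(len(base_feats)),
                    key=lambda j: cosine(novel_feat, base_feats[j]),
                    reverse=True)
    mixed = []
    for j in ranked[:k]:
        lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
        feat = [lam * a + (1.0 - lam) * b
                for a, b in zip(novel_feat, base_feats[j])]
        target = {novel_label: lam, base_labels[j]: 1.0 - lam}
        mixed.append((feat, target))
    return mixed


# Toy usage: one novel feature and three base features in 2-D.
base_feats = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.3]]
base_labels = ["cat", "car", "dog"]
pairs = vicinal_mixup([1.0, 0.0], "lynx", base_feats, base_labels,
                      k=2, rng=random.Random(0))
prototypes = class_prototypes([[1.0, 0.0], [3.0, 2.0]], ["lynx", "lynx"])
```

In a fine-tuning loop, the mixed pairs would serve as extra training examples with soft targets, and the prototypes would seed the classifier weights for the novel classes before any gradient step. Restricting the mix to the nearest base samples, rather than all base samples, is the design choice that separates this sketch from plain Mixup.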

Version published to 10.21203/rs.3.rs-6973369/v1 on Research Square
Aug 5, 2025

Related articles

Hybrid Framework for Interpretable Deepfake Video Detection Using CapsNet and Transformer Encoders
Authors: Gargi Kadam, Sanika Tiwarekar, Yash Sonawane, Kailas Devadkar, Jignesh Sisodia
No evaluations. Latest version: Aug 21, 2025

Spectral Pyramid Pooling and Fused Keypoint Generation in ResNet-50 for Robust 3D Object Detection
Authors: R. Ramana, V. Vasudevan, B. S. Murugan
No evaluations. Latest version: Aug 21, 2025

HGACH: Hypergraph Attention Convolutional Hashing for Semi-supervised Cross-modal Retrieval
Authors: Fangming Zhong, Rui Zhang, Cun Zhu, Haiquan Yu, Chenglong Chu, Suhua Zhang
No evaluations. Latest version: Sep 24, 2025
