C2-Net: Improving Feature Extraction and Alignment for Few-Shot Fine-Grained Image Classification
Abstract
Few-shot fine-grained image classification (FS-FGIC) aims to distinguish visually similar, fine-grained categories from only a limited number of labeled training samples. Two major challenges remain. The first is extracting the features essential for fine-grained discrimination while suppressing irrelevant noise, which can cause overfitting under few-shot conditions. The second is achieving robust feature alignment between support and query samples, especially under spatial variations such as differences in object position or viewing angle. This paper introduces C2-Net, a framework with two modules designed to address these challenges. The Cross-Layer Feature Refinement (CLFR) module improves feature quality by fusing outputs from multiple network layers, reducing noise at the sample level. The Cross-Sample Feature Adjustment (CSFA) module compensates for spatial and channel-wise differences, ensuring that features are aligned across the few support and query samples. Through these mechanisms, C2-Net reduces misalignment and improves feature discrimination. Comprehensive experiments on five benchmark datasets demonstrate that C2-Net consistently outperforms existing methods, achieving state-of-the-art (SOTA) results in most cases; for example, it improves 1-shot classification accuracy on the CUB dataset from 54.87% to 76.51% and 5-shot accuracy from 79.09% to 88.15%. These results represent a significant advance in tackling the challenges of FS-FGIC.
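To make the two ideas concrete, the following is a minimal, hypothetical PyTorch sketch (not the authors' implementation): a cross-layer fusion block that merges feature maps from two backbone stages, in the spirit of CLFR, and a soft support-to-query spatial alignment step, in the spirit of CSFA. All module names, channel sizes, and the attention-based alignment mechanism here are illustrative assumptions, not the paper's exact design.

```python
# Hypothetical sketch of cross-layer feature fusion and cross-sample alignment.
# Not the authors' code; shapes and mechanisms are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossLayerFusion(nn.Module):
    """Fuse feature maps from two backbone stages (CLFR-style idea)."""

    def __init__(self, shallow_channels: int, deep_channels: int, out_channels: int):
        super().__init__()
        # Project both stages to a common channel width before fusing.
        self.proj_shallow = nn.Conv2d(shallow_channels, out_channels, kernel_size=1)
        self.proj_deep = nn.Conv2d(deep_channels, out_channels, kernel_size=1)

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        # Upsample the deeper (lower-resolution) map to the shallow map's spatial
        # size, then sum the projections, letting complementary layers reinforce
        # each other and suppress sample-level noise.
        deep_up = F.interpolate(deep, size=shallow.shape[-2:], mode="bilinear",
                                align_corners=False)
        return F.relu(self.proj_shallow(shallow) + self.proj_deep(deep_up))


def align_support_to_query(support: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
    """Softly re-arrange a support feature map to match a query feature map
    (CSFA-style idea): each query position attends over all support positions."""
    b, c, h, w = query.shape
    q = query.flatten(2).transpose(1, 2)    # (B, HW, C)
    s = support.flatten(2).transpose(1, 2)  # (B, HW, C)
    attn = torch.softmax(q @ s.transpose(1, 2) / c ** 0.5, dim=-1)  # (B, HW, HW)
    aligned = attn @ s                      # support features gathered per query position
    return aligned.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    fusion = CrossLayerFusion(shallow_channels=128, deep_channels=256, out_channels=64)
    shallow = torch.randn(2, 128, 20, 20)   # earlier-stage feature map
    deep = torch.randn(2, 256, 10, 10)      # later-stage feature map
    fused = fusion(shallow, deep)           # (2, 64, 20, 20)

    support = torch.randn(2, 64, 20, 20)
    aligned = align_support_to_query(support, fused)
    print(fused.shape, aligned.shape)
```

In this sketch, fusing a shallow and a deep stage plays the role of cross-layer refinement, and the attention-weighted gathering of support positions stands in for cross-sample spatial adjustment; a channel-wise recalibration step could be added analogously.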