Feature Fusion Units for Fine-grained Image Categorization
Abstract
Fine-grained image categorization aims to distinguish subclasses by processing detailed features. It remains a critical open problem in computer vision because the differences between subclasses are small. Traditional methods usually locate discriminative features through manual annotation, specific sliding windows, varying thresholds, and similar techniques. These methods are not only costly but also ineffective. In computer vision, the transformer greatly improves categorization accuracy by repeatedly computing attention scores between parts of the image and weighting them accordingly. In this paper, we propose a feature fusion unit. Specifically, a transformer is used as the backbone to capture image features (called patches in the transformer), and all patches are then weighted by our feature fusion unit. The output of the feature fusion unit represents the importance of each patch, that is, how strongly it should be focused on. To verify the effectiveness of our method, we conducted experiments on the CUB-200-2011 and Stanford Dogs datasets.
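The patch-weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the internal design of the feature fusion unit, so everything below is an assumption made for illustration: the scoring vector `w`, the dot-product scoring rule, the softmax normalization, and the function name `weight_patches` are all hypothetical and are not taken from the paper. The sketch only shows one plausible way to assign an importance weight to each patch embedding and combine the patches accordingly.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_patches(patches, w):
    """Score each patch embedding and return (weights, fused feature).

    patches: list of N patch embeddings, each a list of D floats
             (stand-ins for the transformer backbone's patch outputs).
    w:       hypothetical scoring vector of length D (an assumption;
             in practice this would be a learned parameter).
    """
    # One scalar importance score per patch (assumed dot-product scoring).
    scores = [sum(p_d * w_d for p_d, w_d in zip(p, w)) for p in patches]
    # Normalize the scores so the weights are non-negative and sum to 1.
    weights = softmax(scores)
    # Weighted sum of the patch embeddings gives a single fused feature.
    dim = len(w)
    fused = [
        sum(weights[n] * patches[n][d] for n in range(len(patches)))
        for d in range(dim)
    ]
    return weights, fused


# Toy input: 4 random patch embeddings of dimension 8.
random.seed(0)
patches = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
w = [random.gauss(0.0, 1.0) for _ in range(8)]
weights, fused = weight_patches(patches, w)
```

Here `weights[n]` plays the role of the importance assigned to patch `n`, and `fused` is the resulting combined feature that a classifier head could consume. The actual unit in the paper may differ substantially from this sketch.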