Feature Fusion Units for Fine-grained Image Categorization

Abstract

Fine-grained image categorization aims to distinguish subclasses by their detailed features, and it remains a critical open problem in computer vision because the differences between subclasses are small. Traditional methods usually locate discriminative features through manual annotation, specific sliding windows, hand-chosen thresholds, and similar techniques. These methods are both costly and ineffective. In computer vision, the transformer greatly improves categorization accuracy by repeatedly computing attention scores between parts of the image and weighting them. In this paper, we propose a feature fusion unit. Specifically, a transformer is used as the backbone to capture image features (called patches in the transformer), and all patches are then weighted by our feature fusion unit. The output of the feature fusion unit represents the importance of each patch, that is, how strongly it should be focused on. To verify the effectiveness of our method, we conducted experiments on the CUB-200-2011 and Stanford Dogs datasets.
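
The abstract does not specify the internal form of the unit, so the following is only a minimal sketch of the general idea of scoring each patch and weighting it by that score. Everything in it is an assumption made for illustration: the function names `softmax` and `weight_patches`, the linear scoring vector `w`, the softmax normalization, the random stand-in for the backbone output, and the shapes (196 patches of dimension 768, as in a common vision transformer configuration). It should not be read as the authors' architecture.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    e = np.exp(shifted)
    return e / e.sum()


def weight_patches(patches, w, b=0.0):
    """Score each patch, normalize the scores, and weight the patches.

    patches: (num_patches, dim) array of patch features from a backbone.
    w:       (dim,) scoring vector (hypothetical; would be learned in practice).
    b:       scalar bias (hypothetical).
    Returns (weights, weighted_patches, fused_feature).
    """
    scores = patches @ w + b                # one scalar score per patch
    weights = softmax(scores)               # importance weights summing to 1
    weighted = patches * weights[:, None]   # scale each patch by its weight
    fused = weighted.sum(axis=0)            # single fused image descriptor
    return weights, weighted, fused


# Stand-in for transformer backbone output: 196 patches, 768 dimensions each.
rng = np.random.default_rng(0)
patches = rng.normal(size=(196, 768))
w = rng.normal(size=768) * 0.01

weights, weighted, fused = weight_patches(patches, w)
top5 = np.argsort(weights)[::-1][:5]        # indices of the highest-weighted patches
print("fused feature shape:", fused.shape)
print("top-5 patch indices:", top5)
```

In this sketch a larger weight marks a patch that contributes more to the fused descriptor, which mirrors the abstract's statement that the unit's output represents patch importance. A trained model would learn `w` jointly with the backbone rather than sampling it at random.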
