Food Classification for Dietary Support Using Fine-Grained Visual Recognition with the HERBS Network

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Background: Food properties can directly influence individual dietary intake. Therefore, computer vision-based food recognition could be used to estimate meal contents for patients with metabolic diseases. Food recognition based on deep learning can create opportunities for breakthroughs in dietary interventions for personal health management and have rapidly emerged as dietary assessment strategies. Methods: This study proposed a methodology for automatic food recognition based on fine-grained visual classification, the High-temperaturE Refinement and Background Suppression network (HERBS). A technical investigation was conducted involved in the HERBS model on CNFOOD241 benchmark to verify the model effectiveness. The visualization analysis of HERBS was compared with the VGG16 and RepViT on CNFOOD241. Results: The system achieved classification accuracy of 82.72 % and 97.19 % in Top-1 accuracy and Top-5 accuracy respectively. Data showed that HERBS structure widened the attention area and outperformed VGG16 and RepViT on feature maps. Conclusions: Our findings elucidated that the proposed methodology using HERBS model establishes a new benchmark for SOTA performance in food recognition on CNFOOD 241 dataset. This study proved the feasibility of the proposed approach in a challenging food dataset. We further described a vision-language model structure consisting of a multi-modal vision model, language encoder and a multilayer perceptron classifier for food composition recognition.

Article activity feed