Inception-enabled Vision Transformer (ViT)-based Model for Plant Disease Identification
Abstract
The timely and precise identification of plant diseases is essential for efficient disease control and crop protection. Manual identification requires expert knowledge, and people with such domain expertise are hard to find. To overcome this challenge, researchers have proposed computer vision-based machine learning techniques in recent years. Most of these solutions, built on standard convolutional neural network (CNN) approaches, identify diseases from leaf images captured in laboratory settings with uniform backgrounds; only a few studies have considered real-field images. There is therefore a need for a robust architecture that can identify plant diseases in both laboratory and real-field images. In this paper, we propose an Inception-enabled vision transformer (ViT) architecture for plant disease identification. The proposed architecture extracts both local and global features, which improves feature learning. The use of multiple filters with different kernel sizes makes efficient use of computing resources to extract relevant features without requiring deeper networks. The robustness of the proposed architecture is established through hyper-parameter tuning and comparison with state-of-the-art models. In our experiments, we consider five datasets containing both laboratory-conditioned and real-field images. The experimental results show that the proposed model outperforms state-of-the-art deep learning models with fewer parameters, achieving an accuracy of 99.17% on the apple leaf dataset, 99.32% on the rice dataset, 96.89% on the ibean dataset, 75.42% on the cassava leaf dataset, and 99.33% on the PlantVillage dataset.
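The abstract does not specify the architecture in detail, but the core idea of combining Inception-style multi-kernel branches with ViT-style patch tokens can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the fixed averaging filters (standing in for learned convolution kernels), and the patch size are all assumptions for illustration.

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive 2D convolution with 'same' zero padding on a single-channel image."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def inception_branches(img, kernel_sizes=(1, 3, 5)):
    """Apply parallel branches with different kernel sizes (the Inception idea)
    and stack their outputs as channels. Real models learn these kernels;
    here we use placeholder averaging filters."""
    feats = []
    for k in kernel_sizes:
        kernel = np.ones((k, k)) / (k * k)  # placeholder; learned in practice
        feats.append(conv2d_same(img, kernel))
    return np.stack(feats, axis=0)  # shape: (num_branches, H, W)

def to_patch_tokens(feats, patch=8):
    """Split the multi-scale feature map into non-overlapping patches and
    flatten each patch into a token vector, as fed to a ViT encoder."""
    c, h, w = feats.shape
    tokens = feats.reshape(c, h // patch, patch, w // patch, patch)
    tokens = tokens.transpose(1, 3, 0, 2, 4)  # (nH, nW, C, patch, patch)
    return tokens.reshape((h // patch) * (w // patch), c * patch * patch)

# Usage: a 32x32 grayscale image yields 16 tokens of dimension 3*8*8 = 192,
# carrying local multi-scale features; the ViT's self-attention then models
# global relations between these tokens.
img = np.random.rand(32, 32)
tokens = to_patch_tokens(inception_branches(img), patch=8)
```

In this sketch, the convolutional branches supply local, multi-scale features at low cost (small kernels instead of a deeper stack), while patch tokenization hands those features to the transformer for global context, mirroring the local-plus-global feature learning the abstract describes.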