Optimized Vision Transformers for Superior Plant Disease Detection

Authors
Ouamane, Abdelmalik [1, 2]
Chouchane, Ammar [2, 3]
Himeur, Yassine [4]
Miniaoui, Sami [4]
Atalla, Shadi [4]
Mansoor, Wathiq [4]
Al-Ahmad, Hussain [4]
Affiliations
[1] Univ Biskra, Lab LI3C, Biskra 07000, Algeria
[2] Agence Themat Rech Sci St ATRSS, Es Senia 31000, Algeria
[3] Univ Ctr Barika, Barika 05001, Algeria
[4] Univ Dubai, Coll Engn & Informat Technol, Dubai, U Arab Emirates
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Plant disease detection; vision transformer; convolutional neural network; optimized ViT model; VGG 19 and AlexNet
DOI
10.1109/ACCESS.2025.3547416
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Detecting plant diseases is vital for maintaining agricultural productivity and ensuring food security. Advances in computer vision, particularly with Vision Transformers (ViTs), have shown significant potential in improving the accuracy and efficiency of plant disease identification. This study provides a comprehensive evaluation of various ViT parameters to determine the most effective configuration for this purpose. Using the extensive PlantVillage dataset, we systematically analyzed the effects of patch sizes, image resolutions, embedding dimensions, the number of transformer blocks (depth), the number of heads in the multi-head attention layer, and the dimension of the MLP (FeedForward) layer on model performance. We introduced saliency map visualizations to enhance interpretability and evaluate the critical regions contributing to classification decisions, ensuring the approach's transparency and robustness. Our experiments identified the optimal ViT configuration as follows: image size = 224 × 224, patch size = 16, embedding dimension = 512, depth = 6, number of heads = 8, and MLP dimension = 1024. This configuration achieved an impressive accuracy of 99.77% on the PlantVillage dataset. In addition, we incorporated a novel cross-dataset transferability evaluation to validate the generalizability of the proposed model. Comparative analysis with traditional convolutional neural network architectures, such as VGG19 and AlexNet, revealed that our optimized ViT model not only surpasses these models in accuracy but also requires significantly fewer trainable parameters and storage space. The incorporation of a lightweight, domain-specific fine-tuning process ensures the model's adaptability to new datasets with minimal computational overhead. Our findings highlight the scalability and adaptability of ViTs, emphasizing their ability to effectively handle varying image sizes and resolutions. Moreover, our approach outperforms recent state-of-the-art methods across multiple databases, underscoring the efficacy of the chosen ViT parameters.
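To make the reported configuration concrete, the following minimal sketch derives the quantities implied by the six hyperparameters in the abstract. Only those six values come from the abstract. The class name `ViTConfig`, the RGB input, the single class token, the fused QKV projection with biases, and the pre-norm block layout are assumptions borrowed from the standard ViT design; the paper's actual implementation may differ, so the parameter count is an estimate for one encoder block, not a figure reported by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTConfig:
    """Hypothetical container for the hyperparameters listed in the abstract."""
    image_size: int = 224   # from the abstract
    patch_size: int = 16    # from the abstract
    embed_dim: int = 512    # from the abstract
    depth: int = 6          # from the abstract
    num_heads: int = 8      # from the abstract
    mlp_dim: int = 1024     # from the abstract
    in_channels: int = 3    # assumption: RGB images

    @property
    def num_patches(self) -> int:
        # Non-overlapping square patches tile the image in a grid.
        if self.image_size % self.patch_size != 0:
            raise ValueError("image_size must be divisible by patch_size")
        per_side = self.image_size // self.patch_size
        return per_side * per_side

    @property
    def seq_len(self) -> int:
        # Assumption: one learnable class token is prepended to the patches.
        return self.num_patches + 1

    @property
    def patch_dim(self) -> int:
        # Length of one flattened patch before the linear embedding.
        return self.patch_size * self.patch_size * self.in_channels

    @property
    def head_dim(self) -> int:
        # The embedding is split evenly across the attention heads.
        if self.embed_dim % self.num_heads != 0:
            raise ValueError("embed_dim must be divisible by num_heads")
        return self.embed_dim // self.num_heads

    def block_params(self) -> int:
        """Parameter estimate for one standard encoder block (assumed layout)."""
        d, m = self.embed_dim, self.mlp_dim
        layer_norms = 2 * (2 * d)          # two LayerNorms, scale and shift
        qkv = d * (3 * d) + 3 * d          # fused query/key/value projection
        attn_out = d * d + d               # attention output projection
        mlp = (d * m + m) + (m * d + d)    # two linear layers of the MLP
        return layer_norms + qkv + attn_out + mlp


cfg = ViTConfig()
print(f"patches per image : {cfg.num_patches}")
print(f"sequence length   : {cfg.seq_len}")
print(f"flattened patch   : {cfg.patch_dim}")
print(f"dimension per head: {cfg.head_dim}")
print(f"params per block  : {cfg.block_params():,}")
print(f"params, all blocks: {cfg.depth * cfg.block_params():,}")
```

Under these assumptions, a 224 × 224 image is cut into 196 patches and each of the 8 heads attends over 64-dimensional slices; the encoder-block estimate excludes the patch embedding, positional embeddings, and classification head.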
Pages: 48552-48570
Number of pages: 19