Optimized Vision Transformers for Superior Plant Disease Detection

Authors
Ouamane, Abdelmalik [1, 2]
Chouchane, Ammar [2, 3]
Himeur, Yassine [4 ]
Miniaoui, Sami [4 ]
Atalla, Shadi [4 ]
Mansoor, Wathiq [4 ]
Al-Ahmad, Hussain [4 ]
Affiliations
[1] Univ Biskra, Lab LI3C, Biskra 07000, Algeria
[2] Agence Themat Rech Sci St ATRSS, Es Senia 31000, Algeria
[3] Univ Ctr Barika, Barika 05001, Algeria
[4] Univ Dubai, Coll Engn & Informat Technol, Dubai, U Arab Emirates
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Plant disease detection; vision transformer; convolutional neural network; optimized ViT model; VGG 19 and AlexNet
DOI
10.1109/ACCESS.2025.3547416
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Detecting plant diseases is vital for maintaining agricultural productivity and ensuring food security. Advances in computer vision, particularly with Vision Transformers (ViTs), have shown significant potential in improving the accuracy and efficiency of plant disease identification. This study provides a comprehensive evaluation of various ViT parameters to determine the most effective configuration for this purpose. Using the extensive PlantVillage dataset, we systematically analyzed the effects of patch sizes, image resolutions, embedding dimensions, the number of transformer blocks (depth), the number of heads in the multi-head attention layer, and the dimension of the MLP (FeedForward) layer on model performance. We introduced saliency map visualizations to enhance interpretability and evaluate the critical regions contributing to classification decisions, ensuring the approach's transparency and robustness. Our experiments identified the optimal ViT configuration as follows: image size = 224 × 224, patch size = 16, embedding dimension = 512, depth = 6, number of heads = 8, and MLP dimension = 1024. This configuration achieved an impressive accuracy of 99.77% on the PlantVillage dataset. In addition, we incorporated a novel cross-dataset transferability evaluation to validate the generalizability of the proposed model. Comparative analysis with traditional convolutional neural network architectures, such as VGG19 and AlexNet, revealed that our optimized ViT model not only surpasses these models in accuracy but also requires significantly fewer trainable parameters and storage space. The incorporation of a lightweight, domain-specific fine-tuning process ensures the model's adaptability to new datasets with minimal computational overhead. Our findings highlight the scalability and adaptability of ViTs, emphasizing their ability to effectively handle varying image sizes and resolutions. Moreover, our approach outperforms recent state-of-the-art methods across multiple databases, underscoring the efficacy of the chosen ViT parameters.
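
To make the reported configuration concrete, the following minimal Python sketch derives the quantities that follow from it: the number of patches per image, the flattened patch size, the per-head dimension, and a rough trainable-parameter estimate. The six hyperparameters come from the abstract; everything else is an assumption made for illustration and is not taken from the paper, namely RGB input, a linear patch projection with bias, a class token, learned position embeddings, a standard pre-norm encoder block, and 38 output classes. The names `ViTConfig`, `num_patches`, `patch_dim`, `head_dim`, and `estimate_params` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTConfig:
    # Values reported in the abstract.
    image_size: int = 224
    patch_size: int = 16
    dim: int = 512          # embedding dimension
    depth: int = 6          # number of transformer blocks
    heads: int = 8          # attention heads per block
    mlp_dim: int = 1024     # hidden width of the feed-forward layer
    # Assumptions, not stated in the abstract.
    channels: int = 3       # RGB input
    num_classes: int = 38   # placeholder class count


def num_patches(cfg: ViTConfig) -> int:
    """Number of non-overlapping square patches per image."""
    if cfg.image_size % cfg.patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = cfg.image_size // cfg.patch_size
    return per_side * per_side


def patch_dim(cfg: ViTConfig) -> int:
    """Length of one flattened patch before the linear projection."""
    return cfg.channels * cfg.patch_size * cfg.patch_size


def head_dim(cfg: ViTConfig) -> int:
    """Width of each attention head when dim is split evenly."""
    if cfg.dim % cfg.heads != 0:
        raise ValueError("dim must be divisible by heads")
    return cfg.dim // cfg.heads


def estimate_params(cfg: ViTConfig) -> int:
    """Rough parameter count under the assumed block layout."""
    embed = patch_dim(cfg) * cfg.dim + cfg.dim           # patch projection + bias
    tokens = cfg.dim + (num_patches(cfg) + 1) * cfg.dim  # class token + positions
    attn = 4 * (cfg.dim * cfg.dim + cfg.dim)             # Q, K, V, output projections
    mlp = 2 * cfg.dim * cfg.mlp_dim + cfg.mlp_dim + cfg.dim
    norms = 2 * 2 * cfg.dim                              # two LayerNorms per block
    head = 2 * cfg.dim + cfg.dim * cfg.num_classes + cfg.num_classes
    return embed + tokens + cfg.depth * (attn + mlp + norms) + head


if __name__ == "__main__":
    cfg = ViTConfig()
    print("patches per image:", num_patches(cfg))   # 196
    print("flattened patch length:", patch_dim(cfg))  # 768
    print("per-head dimension:", head_dim(cfg))     # 64
    print("estimated parameters:", estimate_params(cfg))
```

The estimate depends entirely on the assumed layout, so it should be read as an order-of-magnitude check rather than as the parameter count reported by the authors.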
Pages: 48552-48570
Number of pages: 19