Optimized Vision Transformers for Superior Plant Disease Detection

Authors
Ouamane, Abdelmalik [1 ,2 ]
Chouchane, Ammar [2 ,3 ]
Himeur, Yassine [4 ]
Miniaoui, Sami [4 ]
Atalla, Shadi [4 ]
Mansoor, Wathiq [4 ]
Al-Ahmad, Hussain [4 ]
Affiliations
[1] Univ Biskra, Lab LI3C, Biskra 07000, Algeria
[2] Agence Themat Rech Sci St ATRSS, Es Senia 31000, Algeria
[3] Univ Ctr Barika, Barika 05001, Algeria
[4] Univ Dubai, Coll Engn & Informat Technol, Dubai, U Arab Emirates
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Plant disease detection; vision transformer; convolutional neural network; optimized ViT model; VGG 19 and AlexNet;
DOI
10.1109/ACCESS.2025.3547416
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Detecting plant diseases is vital for maintaining agricultural productivity and ensuring food security. Advances in computer vision, particularly with Vision Transformers (ViTs), have shown significant potential in improving the accuracy and efficiency of plant disease identification. This study provides a comprehensive evaluation of various ViT parameters to determine the most effective configuration for this purpose. Using the extensive PlantVillage dataset, we systematically analyzed the effects of patch sizes, image resolutions, embedding dimensions, the number of transformer blocks (depth), the number of heads in the multi-head attention layer, and the dimension of the MLP (FeedForward) layer on model performance. We introduced saliency map visualizations to enhance interpretability and evaluate the critical regions contributing to classification decisions, ensuring the approach's transparency and robustness. Our experiments identified the optimal ViT configuration as follows: image size = 224 x 224, patch size = 16, embedding dimension = 512, depth = 6, number of heads = 8, and MLP dimension = 1024. This configuration achieved an impressive accuracy of 99.77% on the PlantVillage dataset. In addition, we incorporated a novel cross-dataset transferability evaluation to validate the generalizability of the proposed model. Comparative analysis with traditional convolutional neural network architectures, such as VGG19 and AlexNet, revealed that our optimized ViT model not only surpasses these models in accuracy but also requires significantly fewer trainable parameters and storage space. The incorporation of a lightweight, domain-specific fine-tuning process ensures the model's adaptability to new datasets with minimal computational overhead. Our findings highlight the scalability and adaptability of ViTs, emphasizing their ability to effectively handle varying image sizes and resolutions. Moreover, our approach outperforms recent state-of-the-art methods across multiple databases, underscoring the efficacy of the chosen ViT parameters.
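To make the reported configuration concrete, the following Python sketch derives the quantities it implies: the number of patches per image, the per-head attention width, and a rough trainable-parameter estimate. This is an illustration only, not the authors' code. The class name `ViTConfig`, the function `estimate_params`, the RGB input, the 38-class output, and the standard ViT layout (linear patch embedding, class token, learned positional embeddings, pre-norm blocks with biases) are all assumptions that the abstract does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTConfig:
    """Hypothetical container for the hyperparameters listed in the abstract."""

    # Values reported in the abstract as the best configuration.
    image_size: int = 224
    patch_size: int = 16
    embed_dim: int = 512
    depth: int = 6
    num_heads: int = 8
    mlp_dim: int = 1024
    # Assumptions (not stated in the abstract): RGB input, 38 output classes.
    in_channels: int = 3
    num_classes: int = 38

    @property
    def num_patches(self) -> int:
        # The image is cut into a square grid of non-overlapping patches.
        if self.image_size % self.patch_size != 0:
            raise ValueError("image_size must be divisible by patch_size")
        return (self.image_size // self.patch_size) ** 2

    @property
    def head_dim(self) -> int:
        # Each attention head works on an equal slice of the embedding.
        if self.embed_dim % self.num_heads != 0:
            raise ValueError("embed_dim must be divisible by num_heads")
        return self.embed_dim // self.num_heads

    @property
    def patch_vector_len(self) -> int:
        # Length of one flattened patch before the linear projection.
        return self.in_channels * self.patch_size ** 2


def estimate_params(cfg: ViTConfig) -> int:
    """Rough parameter count for an assumed standard ViT layout."""
    d = cfg.embed_dim
    patch_embed = cfg.patch_vector_len * d + d        # linear projection + bias
    cls_and_pos = d + (cfg.num_patches + 1) * d       # class token + positions
    attention = 4 * (d * d + d)                       # Q, K, V, output projections
    mlp = (d * cfg.mlp_dim + cfg.mlp_dim) + (cfg.mlp_dim * d + d)
    norms = 2 * (2 * d)                               # two LayerNorms per block
    block = attention + mlp + norms
    final_norm = 2 * d
    head = d * cfg.num_classes + cfg.num_classes      # classification layer
    return patch_embed + cls_and_pos + cfg.depth * block + final_norm + head


if __name__ == "__main__":
    cfg = ViTConfig()
    print("patches per image:", cfg.num_patches)
    print("dimension per head:", cfg.head_dim)
    print("flattened patch length:", cfg.patch_vector_len)
    print("estimated parameters:", estimate_params(cfg))
```

Under these assumptions, a 224 x 224 image with patch size 16 yields a 14 x 14 grid of 196 patches, and each of the 8 heads attends over 64 dimensions. The parameter estimate is only as good as the assumed layout and may differ from the figure reported in the paper.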
Pages: 48552-48570
Page count: 19