Medicinal Plant Leaf Classification using Deep Learning and Vision Transformers

Cited by: 0

Authors:

Hossain, Shahriar [1]

Hasan, Rizbanul [1]

Uddin, Jia [2]

Affiliations:

[1] George Mason Univ, Dept Comp Sci, Fairfax, VA USA

[2] Woosong Univ, Endicott Coll, Dept Artificial Intelligence & Big Data, Daejeon, South Korea

Source:

BAGHDAD SCIENCE JOURNAL | 2025 / Vol. 22 / Issue 03

Keywords:

Classification; CNN-ViT; Medicinal plant leaf; Transfer learning; Vision transformer;

DOI:

10.21123/bsj.2024.10844

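Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];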

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Identifying medicinal plant leaves is crucial because their cultivation and production are essential to the medicine industry. Many classes of medicinal leaves look nearly identical yet serve different purposes and treat different diseases. It is therefore imperative to use methods that are automated, fast, and accurate. Cutting-edge models have been trained to discern the subtle distinctions between leaf species, accounting for factors such as leaf texture, shape, and color variation, which are often imperceptible to the human eye. In this research, transfer learning (TL) based VGG16 and Vision Transformer (ViT) models such as ConvMixer and the Compact Convolutional Transformer (CCT) are implemented to classify medicinal leaf images, using a dataset of 38,066 leaf images in 10 classes. The proposed customized Convolutional Neural Network (CNN) and hybrid CNN-ViT models both have far fewer parameters than the other models compared, making them lightweight and less computationally expensive. In the experimental evaluation, all results are collected over 30 epochs. VGG16, CCT, and ConvMixer produce AUC scores of 0.50, 0.79, and 0.50, respectively, on the dataset, while the proposed CNN and hybrid models achieve AUC scores of 0.83 and 0.74, respectively. In addition, a hybrid denoising approach combining wavelet thresholding and Gaussian blurring is used to reduce noise in the images while retaining the original image quality.
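
To help read the AUC figures reported in the abstract, the following is a minimal sketch of how an AUC score can be computed for a multi-class classifier. The abstract does not say how AUC was aggregated over the 10 classes, so the macro-averaged one-vs-rest scheme here is an assumption, and the function names `binary_auc` and `macro_ovr_auc` are hypothetical. It illustrates the metric only and is not the authors' evaluation code.

```python
def binary_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(labels, score_rows, num_classes):
    """Macro-averaged one-vs-rest AUC (assumed aggregation scheme).

    score_rows[i][c] is the model's score for class c on sample i.
    """
    aucs = []
    for c in range(num_classes):
        binary_labels = [1 if y == c else 0 for y in labels]
        class_scores = [row[c] for row in score_rows]
        aucs.append(binary_auc(binary_labels, class_scores))
    return sum(aucs) / num_classes


# Toy data with 3 classes (hypothetical, not from the paper's dataset).
labels = [0, 1, 2, 0, 1, 2]
uninformative = [[1 / 3, 1 / 3, 1 / 3] for _ in labels]  # same scores for every image
perfect = [[1.0 if c == y else 0.0 for c in range(3)] for y in labels]

auc_uninformative = macro_ovr_auc(labels, uninformative, 3)
auc_perfect = macro_ovr_auc(labels, perfect, 3)
```

Under this definition, a model whose scores carry no information about the class yields an AUC of 0.5, and a model that ranks every image correctly yields 1.0. Read this way, the reported scores of 0.50 for VGG16 and ConvMixer correspond to chance-level ranking.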


Pages: 1065-1076

Page count: 13
