PMVT: a lightweight vision transformer for plant disease identification on mobile devices

Cited by: 39

Authors:

Li, Guoqiang [1]

Wang, Yuchao [2,3]

Zhao, Qing [1]

Yuan, Peiyan [2,3]

Chang, Baofang [2,3]

Affiliations:

[1] Henan Acad Agr Sci, Inst Agr Econ & Informat, Zhengzhou, Henan, Peoples R China

[2] Henan Normal Univ, Coll Comp & Informat Engn, Xinxiang, Henan, Peoples R China

[3] Key Lab Artificial Intelligence & Personalized Lea, Xinxiang, Henan, Peoples R China

Source:

FRONTIERS IN PLANT SCIENCE | 2023 / Vol. 14

Keywords:

plant disease identification; vision transformer; lightweight model; attention module; APP

DOI:

10.3389/fpls.2023.1256773

Chinese Library Classification (CLC) number:

Q94 [Botany]

Discipline classification code:

071001

Abstract:

Due to the constraints of agricultural computing resources and the diversity of plant diseases, it is challenging to achieve the desired accuracy rate while keeping the network lightweight. In this paper, we propose a computationally efficient deep learning architecture based on the mobile vision transformer (MobileViT) for real-time detection of plant diseases, which we call plant-based MobileViT (PMVT). The model is designed to be highly accurate and low-cost, making it suitable for deployment on mobile devices with limited resources. Specifically, we replace the convolution block in MobileViT with an inverted residual structure that employs a 7x7 convolution kernel to effectively model long-distance dependencies between different leaves in plant disease images. Furthermore, inspired by the concept of multi-level attention in computer vision tasks, we integrate a convolutional block attention module (CBAM) into the standard ViT encoder. This integration allows the network to suppress irrelevant information and focus on essential features. The PMVT network achieves reduced parameter counts compared to alternative networks on various mobile devices while maintaining high accuracy across different vision tasks. Extensive experiments on multiple agricultural datasets, including wheat, coffee, and rice, demonstrate that the proposed method outperforms the current best lightweight and heavyweight models. On the wheat dataset, PMVT achieves the highest accuracy of 93.6% using approximately 0.98 million (M) parameters, which is 1.6% higher than that of MobileNetV3. With the same parameter count, PMVT achieves an accuracy of 85.4% on the coffee dataset, surpassing SqueezeNet by 2.3%. Furthermore, our method achieves an accuracy of 93.1% on the rice dataset, surpassing MobileNetV3 by 3.4%. Additionally, we developed a plant disease diagnosis app and successfully used the trained PMVT model to identify plant disease in different scenarios.
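The abstract gives no layer dimensions, so the following is only a back-of-the-envelope sketch of why a 7x7 kernel can remain lightweight inside an inverted residual structure. It assumes the usual MobileNetV2-style layout (1x1 expansion, k x k depthwise convolution, 1x1 projection), which the abstract does not confirm for PMVT, and it uses hypothetical channel counts (64 channels, expansion ratio 4). Normalization layers and biases are ignored.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Number of weights in a 2D convolution with a k x k kernel (no bias)."""
    return (c_in // groups) * c_out * k * k


def inverted_residual_params(channels, expand_ratio, k):
    """Weights in a 1x1 expand -> k x k depthwise -> 1x1 project block.

    Assumed MobileNetV2-style layout; not taken from the PMVT paper.
    """
    hidden = channels * expand_ratio
    expand = conv_params(channels, hidden, 1)
    depthwise = conv_params(hidden, hidden, k, groups=hidden)
    project = conv_params(hidden, channels, 1)
    return expand + depthwise + project


# Hypothetical sizes chosen only for illustration.
channels, expand_ratio = 64, 4

dense_7x7 = conv_params(channels, channels, 7)                  # standard 7x7 conv
ir_3x3 = inverted_residual_params(channels, expand_ratio, 3)    # 3x3 depthwise
ir_7x7 = inverted_residual_params(channels, expand_ratio, 7)    # 7x7 depthwise

print(f"dense 7x7 conv:            {dense_7x7}")
print(f"inverted residual, 3x3 dw: {ir_3x3}")
print(f"inverted residual, 7x7 dw: {ir_7x7}")
```

Under these assumptions, enlarging the depthwise kernel from 3x3 to 7x7 adds only about 10k weights to the block, while a dense 7x7 convolution at the same width would need roughly 200k. This is consistent with the abstract's claim of a larger receptive field at a small parameter budget, but the real PMVT figures depend on its actual layer configuration.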

Pages: 12

References (33 in total):

[1] A survey on using deep learning techniques for plant disease diagnosis and recommendations for development of appropriate tools [J]. Ahmad, Aanis; Saraswat, Dharmendra; El Gamal, Aly. SMART AGRICULTURAL TECHNOLOGY, 2023, 3

[2] Plant disease classification using deep learning [J]. Akshai, K. P.; Anitha, J. ICSPC'21: 2021 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION (ICPSC), 2021: 407-411

[3] Lightweight convolutional neural network model for field wheat ear disease identification [J]. Bao, Wenxia; Yang, Xinghua; Liang, Dong; Hu, Gensheng; Yang, Xianjun. COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2021, 189

[4] Mobile-Former: Bridging MobileNet and Transformer [J]. Chen, Yinpeng; Dai, Xiyang; Chen, Dongdong; Liu, Mengchen; Dong, Xiaoyi; Yuan, Lu; Liu, Zicheng. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 5260-5269

[5] Dosovitskiy A, 2021, arXiv, DOI arXiv:2010.11929

[6] Han K, 2021, ADV NEUR IN

[7] Review of the State of the Art of Deep Learning for Plant Diseases: A Broad Analysis and Discussion [J]. Hasan, Reem Ibrahim; Yusuf, Suhaila Mohd; Alzubaidi, Laith. PLANTS-BASEL, 2020, 9 (10): 1-25

[8] Deep Residual Learning for Image Recognition [J]. He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778

[9] Searching for MobileNetV3 [J]. Howard, Andrew; Sandler, Mark; Chu, Grace; Chen, Liang-Chieh; Chen, Bo; Tan, Mingxing; Wang, Weijun; Zhu, Yukun; Pang, Ruoming; Vasudevan, Vijay; Le, Quoc V.; Adam, Hartwig. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 1314-1324

[10] Transformers in Vision: A Survey [J]. Khan, Salman; Naseer, Muzammal; Hayat, Munawar; Zamir, Syed Waqas; Khan, Fahad Shahbaz; Shah, Mubarak. ACM COMPUTING SURVEYS, 2022, 54 (10S)
