FOTCA: hybrid transformer-CNN architecture using AFNO for accurate plant leaf disease image recognition

Cited by: 5
Authors
Hu, Bo [1]
Jiang, Wenqian [2]
Zeng, Juan [3]
Cheng, Chen [4]
He, Laichang [2]
Affiliations
[1] Nanchang Univ, Sch Informat Engn, Nanchang, Peoples R China
[2] Nanchang Univ, Dept Radiol, Affiliated Hosp 1, Nanchang, Peoples R China
[3] Nanchang Univ, Clin Med Coll 2, Nanchang, Peoples R China
[4] Huazhong Univ Sci & Technol, Sch Math & Stat, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
plant leaf disease image recognition; hybrid architecture; transformer-based models; adaptive Fourier Neural Operator; deep learning;
DOI
10.3389/fpls.2023.1231903
Chinese Library Classification
Q94 [Botany];
Discipline classification code
071001;
Abstract
Plants are widely grown around the world and have high economic value. Plant leaf diseases not only harm the healthy growth and development of plants but also have a negative impact on the environment. Traditional manual methods of identifying plant pests and diseases are costly, inefficient, and inaccurate, whereas computer vision technologies avoid these drawbacks and also shorten control times and reduce the associated costs. The attention mechanism of Transformer-based models (such as the Vision Transformer, ViT) improves image interpretability and extends the gains that convolutional neural networks (CNNs) have made in image recognition, but ViT performs poorly on small and medium-sized datasets. Therefore, in this paper, we propose a new hybrid architecture named FOTCA, which uses a Transformer architecture based on Adaptive Fourier Neural Operators (AFNO) to extract global features first, and then downsamples with convolutional kernels to extract local features in a hybrid manner. To avoid the poor performance of Transformer-based architectures on small datasets, we adopt transfer learning so that the model generalizes well to out-of-distribution (OOD) samples, improving its overall understanding of images. In ablation experiments, focal loss and the hybrid architecture greatly improve the model's convergence speed and recognition accuracy compared with traditional models. The proposed model performs best, with an average recognition accuracy of 99.8% and an F1-score of 0.9931, which is sufficient for deployment in plant leaf disease image recognition.
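The abstract credits focal loss with part of the improvement in convergence and accuracy but does not give its formulation. As a minimal sketch, assuming the standard multi-class form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), the loss for a single sample could look like the following. The function names and the default values gamma=2.0 and alpha=1.0 are illustrative assumptions, not settings taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Standard focal loss for one sample (assumed form, not the paper's code).

    logits: raw class scores; target: index of the true class.
    gamma down-weights well-classified samples; gamma=0 gives plain
    cross-entropy scaled by alpha.
    """
    p_t = softmax(logits)[target]  # predicted probability of the true class
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy sample (confident and correct) contributes far less than a hard one.
easy = focal_loss([5.0, 0.0, 0.0], target=0)
hard = focal_loss([0.0, 5.0, 0.0], target=0)
print(easy < hard)
```

With gamma set to 0 the expression reduces to ordinary cross-entropy, which makes it a convenient baseline when comparing the two losses in an ablation of this kind.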
Pages: 12