Lightweight and efficient deep learning models for fruit detection in orchards

Cited by: 5
Authors
Yang, Xiaoyao [1 ]
Zhao, Wenyang [1 ]
Wang, Yong [1 ]
Yan, Wei Qi [2 ]
Li, Yanqiang [1 ]
Affiliations
[1] Qilu Univ Technol, Inst Automat, Shandong Acad Sci, Jinan 250014, Peoples R China
[2] Auckland Univ Technol, Auckland, New Zealand
Keywords
Recognition of apple; Lightweight network; Attention mechanism; Object detection; Deep learning;
DOI
10.1038/s41598-024-76662-w
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes
07; 0710; 09;
Abstract
The accurate recognition of apples in complex orchard environments is fundamental to the operation of automated picking equipment. This paper investigates the influence of dense targets, occlusion, and the natural environment in practical application scenarios. To this end, it constructs a fruit dataset covering different scenarios and proposes a real-time lightweight detection network, ELD (Efficient Lightweight object Detector). The EGSS (Efficient Ghost-shuffle Slim) module and MCAttention (Mix channel Attention) are proposed as solutions to the problems of feature extraction and classification. The attention mechanism is employed to construct a novel feature extraction network, which effectively utilizes low-level feature information, significantly enhances the fine-grained features and gradient flow of the model, and improves its anti-interference ability. Redundant channels are eliminated with SlimPAN to further compress the network and optimize its function. The network as a whole employs the Shape-IoU loss function, which considers the influence of the bounding box itself, thereby enhancing the robustness of the model. Finally, detection accuracy is enhanced by transferring knowledge from a teacher network through knowledge distillation, while keeping the overall network sufficiently lightweight. The experimental results demonstrate that the ELD network, designed for fruit detection, achieves an accuracy of 87.4%, with a relatively low parameter count (4.3 × 10^5), only 1.7 GFLOPs, and a high frame rate of 156 FPS.
The network thus achieves high accuracy while consuming fewer computational resources than comparable networks.
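The abstract mentions transferring knowledge from a teacher network via knowledge distillation. The paper's exact distillation objective is not reproduced in this record; as a generic illustration only, a minimal Hinton-style soft-target loss (function names and the temperature value are illustrative assumptions, not the authors' formulation) can be sketched as:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a 1-D array of class logits."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence from the student's softened distribution to the
    teacher's, scaled by T^2 (classic soft-target distillation)."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    return temperature ** 2 * float(np.sum(p * (np.log(p) - np.log(q))))
```

A higher temperature flattens the teacher's distribution, exposing the relative similarity between classes (the "dark knowledge") that a hard one-hot label would discard; the loss is zero when student and teacher logits agree.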
Pages: 20
References (61 total)
[1] Alexey Bochkovskiy C.-Y.W., 2023, YOLOv8.
[2] Alexey Bochkovskiy H.-Y. M.L., 2021, YOLOv5.
[3] Bochkovskiy A., 2020, arXiv:2004.10934, DOI 10.48550/arXiv.2004.10934.
[4] Carion N., Massa F., Synnaeve G., Usunier N., Kirillov A., Zagoruyko S. End-to-End Object Detection with Transformers. Computer Vision - ECCV 2020, Pt I, 2020, 12346: 213-229.
[5] Chen J., Kao S.-H., He H., Zhuo W., Wen S., Lee C.-H., Chan S.-H. G. Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 12021-12031.
[6] Chen X., 2021, YOLOv5-Lite: Lighter, Faster and Easier to Deploy.
[7] Choi J. I., Tian Q. Visual-Saliency-Guided Channel Pruning for Deep Visual Detectors in Autonomous Driving. 2023 IEEE Intelligent Vehicles Symposium (IV), 2023.
[8] Chollet F. Xception: Deep Learning with Depthwise Separable Convolutions. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017: 1800-1807.
[9] Fu J., Liu J., Tian H., Li Y., Bao Y., Fang Z., Lu H. Dual Attention Network for Scene Segmentation. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019: 3141-3149.
[10] Gevorgyan Z., 2022, arXiv:2205.12740.