Comprehensive visual information acquisition for tomato picking robot based on multitask convolutional neural network

Cited by: 9
Authors
Du, Xiaoqiang [1 ,2 ,3 ]
Meng, Zhichao [1 ]
Ma, Zenghong [1 ,2 ,3 ]
Zhao, Lijun [4 ]
Lu, Wenwu [1 ]
Cheng, Hongchao [1 ]
Wang, Yawei [1 ]
Affiliations
[1] Zhejiang Sci Tech Univ, Sch Mech Engn, Hangzhou 310018, Peoples R China
[2] Zhejiang Key Lab Transplanting Equipment & Techno, Hangzhou 310018, Peoples R China
[3] Collaborat Innovat Ctr Intelligent Prod Equipment, Hangzhou 310018, Peoples R China
[4] Chongqing Univ Arts & Sci, Coll Intelligent Mfg Engn, Chongqing 402160, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
YOLO-MCNN; Semantic segmentation; Multitask convolutional neural network; Tomato picking robot; SEGMENTATION;
DOI
10.1016/j.biosystemseng.2023.12.017
Chinese Library Classification (CLC)
S2 [Agricultural Engineering]
Discipline code
0828
Abstract
The vision system of a tomato picking robot faces two difficult tasks: precise acquisition of the tomato pose and location of the stem. Together, these determine the end-effector pose and enable collision-free picking. For efficient crop picking, the tasks of target location, pose detection, and obstacle semantic segmentation should be completed in a single model that yields comprehensive visual information. To this end, the multitask convolutional neural network YOLO-MCNN is proposed, a new method that completes all of the above tasks in one model. By fusing multi-scale features and determining the optimal location for the semantic segmentation branch, four strategies are proposed for enhancing segmentation ability. Experimental results show that fusing the semantic segmentation branch with the second layer of shallow feature maps and placing the branch after the 17th layer yields the best segmentation performance. Fusing shallow feature maps improves small-target detection, while merging multi-scale feature maps enhances semantic segmentation. Moreover, ablation experiments are conducted to compare the multitask network against single-task networks; they show that running multiple tasks on the same backbone network does not degrade the performance of any individual task. YOLO-MCNN achieves a target-detection F1 of 87.8%, a semantic-segmentation mIoU of 74.8%, a keypoint-detection error d_lmk of 6.95 pixels, a network size of 15.2 MB, and an inference time of 19.9 ms. Compared with other target detection and semantic segmentation networks, YOLO-MCNN offers the best overall performance. The method provides a theoretical foundation for constructing multitask convolutional neural networks.
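For readers comparing the metrics reported in the abstract, the sketch below shows the standard definitions of detection F1 and segmentation mIoU. All counts in the example are illustrative placeholders, not data from the paper:

```python
# Standard definitions of the two headline metrics in the abstract:
# F1 for target detection and mean IoU for semantic segmentation.
# The example counts below are made up for illustration only.

def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mean_iou(per_class_counts: list[tuple[int, int]]) -> float:
    """Average intersection-over-union across classes.

    per_class_counts: one (intersection, union) pixel-count pair per class.
    """
    return sum(i / u for i, u in per_class_counts) / len(per_class_counts)

print(f1_score(90, 10, 15))              # F1 from hypothetical detection counts
print(mean_iou([(80, 100), (60, 90)]))   # mIoU over two hypothetical classes
```

Note that F1 penalises false positives and false negatives symmetrically, which is why it is preferred over raw accuracy for detection benchmarks like the one reported here.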
Pages
51-61 (11 pages)