Application of amodal segmentation on cucumber segmentation and occlusion recovery

Cited by: 17
Authors
Kim, Sungjay [1, 2]
Hong, Suk-Ju [1, 3]
Ryu, Jiwon [1, 2]
Kim, Eungchan [1, 2]
Lee, Chang-Hyup [1, 2]
Kim, Ghiseok [1, 2, 3]
Affiliations
[1] Seoul Natl Univ, Dept Biosyst Engn, 1 Gwanak ro, Seoul 08826, South Korea
[2] Seoul Natl Univ, Global Smart Farm Convergence Major, 1 Gwanak ro, Seoul 08826, South Korea
[3] Seoul Natl Univ, Res Inst Agr & Life Sci, 1 Gwanak ro, Seoul 08826, South Korea
Keywords
Amodal segmentation; Autonomous harvesting; Greenhouse cucumber; Mask region-based convolutional neural network; Occlusion recovery; Reconstruction network; Recognition
DOI
10.1016/j.compag.2023.107847
Chinese Library Classification (CLC) number
S [Agricultural Sciences]
Discipline classification code
09
Abstract
Computer vision tasks, such as image recognition, object detection, and semantic segmentation, have contributed tremendously to autonomous harvesting. However, these tasks are applicable to the visible parts of an object in an image. In the field of agriculture, detecting green vegetables with subtle color differences poses an additional challenge. The nature of cucumbers (Cucumis sativus) makes them difficult to detect. To address this issue, we reconstructed the occluded part of the cucumber to help autonomous robots detect and locate picking-point positions. A dataset of cucumber images from two farms located in South Korea was generated. The dataset was superimposed with synthetic leaf patches to simulate the effect of occlusion. Using this dataset, we employed amodal segmentation with an auto-encoder and an ablation study regarding shape prior post-process, shape prior refinement, and feature matching to determine a suitable method for our cucumber dataset. We then proposed amodal segmentation with a U-net reconstruction network as a novel model for cucumber occlusion recovery. In the ablation study, the model with no additional process showed the highest accuracy, average precision (AP) of 49.31 and average precision with intersection over union (IoU) 0.5 (AP50) of 82.39, and the fastest inference time of 233 ms/image. Our proposed model outperformed auto-encoder-based models with an AP of 50.06, an AP50 of 82.43, and an inference time of 220 ms/image. The proposed method has been shown to be effective in improving the accuracy of cucumber segmentation under occlusion conditions. Therefore, amodal segmentation, particularly with the U-net reconstruction network, seems promising for the vision systems of cucumber picking robots. 
The main contribution of the study is that with amodal segmentation, detection of occluded cucumber instances can be done in a single stage with promising accuracy and speed, thus no longer requiring additional time-consuming operations of the manipulators in harvest decision making.
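The abstract describes simulating occlusion by superimposing synthetic patches and scoring results at an IoU threshold of 0.5 (AP50). The following is a minimal sketch of those two ideas on a toy binary mask, not the authors' code: the function names `occlude` and `mask_iou`, the rectangular occluder, and the 6x4 mask are all assumptions made for illustration.

```python
def occlude(amodal, top, left, height, width):
    """Return the visible mask left after a rectangular patch hides part of the amodal mask."""
    visible = [row[:] for row in amodal]  # copy so the amodal (full-shape) mask is untouched
    for r in range(top, min(top + height, len(amodal))):
        for c in range(left, min(left + width, len(amodal[0]))):
            visible[r][c] = 0  # pixel is covered by the synthetic occluder
    return visible


def mask_iou(a, b):
    """Intersection over union of two binary masks given as lists of 0/1 rows."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0


# Hypothetical amodal mask: a 6-row object occupying columns 1-2 (12 pixels).
amodal = [[0, 1, 1, 0] for _ in range(6)]

# Hide rows 2-3 across the full width, removing 4 object pixels.
visible = occlude(amodal, top=2, left=0, height=2, width=4)

# How well the visible-only mask matches the full shape (8 of 12 pixels overlap).
iou = mask_iou(visible, amodal)

# Under an IoU threshold of 0.5, this mask would still count as a match.
matches_at_iou_50 = iou >= 0.5
```

In this toy case the visible mask alone still clears the 0.5 threshold; a larger occluder would push it below, which is the gap an amodal reconstruction stage is meant to close.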
Pages: 13