Segmenting banana images using the lightweight UNet of multi-scale serial dilated convolution

Cited by: 0
Authors
Zhu L. [1]
Wu R. [1]
Fu G. [2]
Zhang S. [1]
Yang C. [1]
Chen T. [2]
Huang P. [2]
Affiliations
[1] School of Electro Mechanical Engineering, Zhongkai University of Agriculture and Engineering, Guangzhou
[2] School of Automation, Zhongkai University of Agriculture and Engineering, Guangzhou
Source
Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering | 2022, Vol. 38, No. 13
Keywords
banana detection; lightweight UNet; multi-scale series dilated convolution; semantic segmentation
DOI
10.11975/j.issn.1002-6819.2022.13.022
Abstract
Efficient and accurate fruit detection is one of the most important tasks in the field operation of agricultural robots. However, the complex background and uncertain conditions of the orchard environment remain a great challenge, because fruit and branches resemble or occlude each other. The traditional UNet cannot fully meet the requirements of a banana recognition and picking system, owing to its low real-time performance, large number of parameters, and loss of spatial information after downsampling. In this study, a lightweight UNet using multi-scale serial dilated convolution was proposed for banana image segmentation. First, the feature maps of the UNet encoding layers were visualized. The UNet extracted similar features many times in order to better identify the edge texture and color features of the targets. Therefore, a lightweight feature-extraction backbone module, the Concat Depth Wise Block (CDWBlock), was constructed. The module obtained effective feature maps using cheaper operations, so the number of model parameters and the amount of computation were reduced significantly without losing the feature extraction ability of the network. A banana orchard also has specific characteristics, including a complex background, large clusters of banana fruit, small fruit stalks, and low color contrast. In a neural network, a filter with a wide receptive field identifies large objects more easily, while a filter with a narrow receptive field identifies small objects more easily. However, the original UNet segmentation network, with a single type of receptive field filter, had difficulty handling large objects (such as the banana fruit string) and small objects (such as the banana fruit stalk or the irregular edges of the fruit string) at the same time. The information of small objects was easily lost in the down- and up-sampling operations.
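To make the receptive-field argument concrete, the following minimal sketch computes the theoretical receptive field of a serial stack of dilated convolutions, using the dilation rates [2, 1, 2] reported in this abstract. It assumes 3×3 kernels with stride 1, which the abstract does not state, and `receptive_field` is a hypothetical helper name rather than code from the paper.

```python
def receptive_field(dilations, kernel_size=3):
    """Theoretical receptive field (one side, in pixels) of serially
    stacked stride-1 convolutions with the given dilation rates.

    Assumption: every layer uses the same square kernel and stride 1.
    """
    rf = 1  # a single pixel sees only itself before any convolution
    for d in dilations:
        # each layer widens the field by (kernel_size - 1) * dilation
        rf += (kernel_size - 1) * d
    return rf


plain = receptive_field([1, 1, 1])     # three ordinary 3x3 convolutions
sawtooth = receptive_field([2, 1, 2])  # sawtooth-like rates from the abstract
print(plain, sawtooth)
```

Under these assumptions, the rates [2, 1, 2] give an 11×11 receptive field, compared with 7×7 for three undilated 3×3 layers that have the same number of weights.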
Therefore, a group of sawtooth-wave-like dilated convolutions with expansion rates of [2, 1, 2] was proposed to increase the receptive field while keeping high sensitivity to the data. The banana segmentation dataset consisted of 3000 images, divided into 2400, 300, and 300 images for the training, validation, and test sets, respectively. The learning rate was adjusted dynamically during training: once the loss value had not decreased for 10 epochs, the learning rate was reduced by a factor of 10. The loss function combined the Dice loss and the binary cross-entropy loss. Experiments showed that the network had 0.45 million parameters, the recognition and segmentation speed reached 41.0 frames/s, and the mean pixel accuracy and mean intersection over union reached 97.32% and 92.57%, respectively. The expansion rates of [2, 1, 2] were selected for their excellent segmentation performance at both the edge and the stalk of the banana fruit. The improved model achieved higher precision with fewer parameters than the other models, and a better balance between precision and speed. Better recognition and response speed were therefore obtained in the banana orchard, even though the dataset contained only a small number of images. The findings can provide visual recognition technology support for intelligent banana picking robots. The improved model can also be easily transferred to subsequent applications, such as 3D reconstruction of agricultural targets, 3D positioning of banana fruit, and motion planning of agricultural picking robots. © 2022 Chinese Society of Agricultural Engineering. All rights reserved.
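The abstract states that the loss combines Dice loss and binary cross-entropy loss but does not give the weighting or smoothing terms. The sketch below is one common formulation on flattened per-pixel probabilities; the equal weights, the `smooth` and `eps` constants, and all function names are assumptions for illustration, not the authors' implementation.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over per-pixel foreground probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """1 - soft Dice coefficient; `smooth` is an assumed stabilizer."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, targets, dice_weight=1.0, bce_weight=1.0):
    """Weighted sum of Dice and BCE; equal weights are an assumption."""
    return (dice_weight * dice_loss(probs, targets)
            + bce_weight * bce_loss(probs, targets))


mask = [1, 0, 1, 0]                # toy ground-truth banana mask
good = [0.9, 0.1, 0.8, 0.2]        # prediction close to the mask
bad = [0.1, 0.9, 0.2, 0.8]         # prediction far from the mask
print(combined_loss(good, mask) < combined_loss(bad, mask))
```

The Dice term responds to region overlap, which helps when the foreground (for example a thin fruit stalk) covers few pixels, while the cross-entropy term gives a per-pixel gradient signal.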
Pages: 194-201
Page count: 7