Self-supervised pose estimation method for a mobile robot in greenhouse

Cited by: 0
Authors
Zhou Y. [1]
Xu T. [1]
Deng H. [1]
Miao T. [1]
Wu Q. [1]
Affiliation
[1] College of Information and Electrical Engineering, Shenyang Agricultural University, Shenyang
Source
Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering | 2021 / Vol. 37 / No. 09
Keywords
Convolutional neural network; Deep learning; Greenhouse; Navigation; Pose estimation; Robots; Self-supervised learning; Visual odometry
DOI
10.11975/j.issn.1002-6819.2021.09.030
Abstract
Simultaneous localization and mapping (SLAM) plays a vital role in the autonomous navigation of mobile robots in unknown environments. Visual odometry (VO), in particular, is a core component of the localization module in a SLAM system, estimating the pose and velocity of a robot using computational geometry. Learning-based VO has also achieved great success in jointly estimating camera ego-motion and depth from videos. In this study, a novel self-supervised VO model was proposed to support the autonomous operation of a mobile robot in a greenhouse. A temporal depth consistency constraint was introduced into a learning framework supervised by the binocular baseline. Stereo video sequences were used to train the model, and the trained pose network was then used for pose estimation. A preliminary test found that stillness between video frames caused the values predicted by the model to shrink. Therefore, a soft mask was applied to the photometric re-projection error to remove static regions from the appearance difference measurement, and non-rigid scenes and occlusion were further handled with normalized mask planes. Meanwhile, a new star dilated convolution (SDC) was designed, whose filter extracts image features with a central 3×3 solid kernel and 1-D kernels in eight directions. The computational cost of the SDC was thus lower than that of a regular convolution with the same receptive field. Moreover, the SDC was implemented by superimposing, in the spatial dimensions, depth-wise convolutions with different dilation rates, without the need to modify the existing deep learning framework. A convolutional auto-encoder (CAE) with a residual network architecture was constructed from the SDC and the inverted residual module (IRM), and served as the backbone network of the VO model. Video sequences were collected with a binocular camera in solar greenhouses where tomato was the crop.
A stereo video dataset was constructed for the training and testing experiments. Static samples of the video sequences were removed from the image appearance difference measurement with a soft mask. As a result, the mean relative errors (MREs) of translation and rotation estimation were reduced by 5.06 and 11.05 percentage points, respectively, while the root mean square errors (RMSEs) were reduced by 24.78% and 30.65%, respectively. When a normalized mask plane was used to deal with non-rigid scenes and occlusion, the MREs of translation and rotation estimation were reduced by 4.15 and 3.86 percentage points, respectively. Both masks thus significantly improved the accuracy of the model. Meanwhile, the SDC-based IRM (SDC-IRM) reduced the MRE of rotation by 7.54 percentage points with the number of network parameters unchanged. Since the SDC-IRM structure was effective in reducing model error, enlarging the receptive field was an effective way to improve the accuracy of the model. When the temporal depth consistency constraint was used, the RMSE of rotation estimation was reduced by 36.48% and the mean cumulative rotation error per hundred frames (MCRE) decreased by 54.75%, indicating high accuracy and stability of pose estimation. The MRE of rotation estimation was reduced by 7.30 percentage points when the expansion factor of the IRMs was increased. The data demonstrated that enlarging the receptive field of the SDC kernel contributed to higher accuracy of rotation estimation. Nevertheless, the model no longer improved noticeably once the maximum dilation rate exceeded 6. The final pose estimation network ran at up to 56.5 frames per second, with MREs of translation and rotation estimation of 8.29% and 5.71%, respectively.
The pose estimation performed better than that of previous VO models under similar input settings. These findings can provide sound support for the design of navigation systems for mobile robots in greenhouses. © 2021, Editorial Department of the Transactions of the Chinese Society of Agricultural Engineering. All rights reserved.
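The star-shaped footprint described in the abstract (a solid 3×3 center plus 1-D arms in eight directions, obtained by superimposing kernels with different dilation rates) can be illustrated with a small sketch. This is not the authors' implementation: it assumes the star is the union of 3×3 kernels with dilation rates 1 to a maximum rate, all centered on the same pixel, and the names `star_offsets` and `render` are hypothetical. It only compares the number of sampled positions with that of a dense kernel covering the same receptive field.

```python
def star_offsets(max_dilation):
    """Return the (dy, dx) tap offsets of a star-shaped kernel.

    Assumption (not confirmed by the abstract): the star is the union of
    3x3 kernels with dilation rates 1..max_dilation, sharing one center.
    """
    offsets = set()
    for d in range(1, max_dilation + 1):
        # A 3x3 kernel with dilation d samples offsets in {-d, 0, d}^2.
        for dy in (-d, 0, d):
            for dx in (-d, 0, d):
                offsets.add((dy, dx))
    return offsets


def render(offsets, max_dilation):
    """Draw the kernel footprint: '#' marks a sampled position."""
    span = range(-max_dilation, max_dilation + 1)
    return "\n".join(
        "".join("#" if (y, x) in offsets else "." for x in span)
        for y in span
    )


if __name__ == "__main__":
    max_rate = 6  # the abstract reports little gain beyond a rate of 6
    taps = star_offsets(max_rate)
    side = 2 * max_rate + 1  # receptive field is side x side
    print(render(taps, max_rate))
    print("star positions: ", len(taps))
    print("dense positions:", side * side)
```

Under this assumption, each added dilation rate contributes eight new positions, so the star samples 8D + 1 positions for a maximum rate D, whereas a dense kernel with the same receptive field samples (2D + 1)² positions. For D = 6 that is 49 against 169, which is consistent with the abstract's statement that the SDC costs less than a regular convolution of the same receptive field.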
Pages: 263-274
Number of pages: 11