Bags of tricks for learning depth and camera motion from monocular videos

Cited by: 0
Authors
Dong B. [1 ]
Sheng L. [2 ]
Affiliations
[1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin
[2] College of Software, Beihang University, Beijing
Source
Virtual Reality and Intelligent Hardware | 2019, Vol. 1, Issue 05
Keywords
Monocular visual odometry; Unsupervised learning
DOI
10.1016/j.vrih.2019.09.004
Abstract
Background: Based on the seminal work of Zhou et al., much of the recent progress in learning monocular visual odometry, i.e., depth and camera motion from monocular videos, can be attributed to tricks in the training procedure, such as data augmentation and the choice of learning objectives. Methods: Herein, we categorize a collection of such tricks through theoretical examination and empirical evaluation of their effects on the final accuracy of visual odometry. Results/Conclusions: By combining the aforementioned tricks, we were able to significantly improve a baseline model adapted from SfMLearner without additional inference costs. Furthermore, we analyzed the principles behind these tricks and the reasons for their success. Practical guidelines for future research are also presented. © 2019 Beijing Zhongke Journal Publishing Co. Ltd
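The training signal common to SfMLearner-style methods mentioned in the abstract is a view-synthesis (photometric reconstruction) loss: the predicted depth of a target frame and the predicted relative camera pose are used to warp a nearby source frame onto the target, and the photometric error is minimized. The following is a minimal NumPy sketch under assumed conventions (a 4×4 camera pose matrix `T`, nearest-neighbour sampling in place of the differentiable bilinear sampling used in practice); the function name and signature are illustrative, not the authors' code.

```python
import numpy as np

def view_synthesis_loss(target, source, depth, K, T):
    """Photometric reconstruction loss (SfMLearner-style, simplified).

    target, source: (H, W) grayscale images
    depth:          (H, W) per-pixel depth of the target view
    K:              (3, 3) camera intrinsics
    T:              (4, 4) pose of the target camera in the source frame
    """
    H, W = depth.shape
    # Homogeneous pixel grid of the target view, shape (3, H*W)
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1).astype(float)
    # Back-project pixels to 3D points in the target camera frame
    cam = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    # Transform the points into the source camera frame
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    src_cam = (T @ cam_h)[:3]
    # Project into the source image plane
    src_pix = K @ src_cam
    u = src_pix[0] / src_pix[2]
    v = src_pix[1] / src_pix[2]
    # Nearest-neighbour sampling; real implementations use bilinear
    # sampling so the loss is differentiable w.r.t. depth and pose
    ui = np.clip(np.round(u).astype(int), 0, W - 1)
    vi = np.clip(np.round(v).astype(int), 0, H - 1)
    recon = source[vi, ui].reshape(H, W)
    # L1 photometric error between warped source and target
    return np.abs(recon - target).mean()
```

With an identity pose and constant depth, each pixel projects back onto itself, so warping a frame onto itself yields zero loss; the tricks surveyed in the paper (augmentation, auxiliary objectives) all act on top of this basic objective.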
Pages: 500-510
Page count: 10
Related References
21 entries in total
[11] Yin Z.C., Shi J.P., GeoNet: unsupervised learning of dense depth, optical flow and camera pose, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2018)
[12] Meister S., Hur J., Roth S., UnFlow: unsupervised learning of optical flow with a bidirectional census loss, Thirty-Second AAAI Conference on Artificial Intelligence, (2018)
[13] Cao Z., Kar A., Hane C., Malik J., Learning independent object motion from unlabelled stereoscopic videos, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5594-5603, (2019)
[14] Lv Z., Kim K., Troccoli A., Sun D.Q., Rehg J.M., Kautz J., Learning rigidity in dynamic scenes with a moving camera for 3D motion field estimation, Computer Vision – ECCV 2018, pp. 484-501, (2018)
[15] Yang Z.H., Wang P., Wang Y., Xu W., Nevatia R., Every pixel counts: unsupervised geometry learning with holistic 3D motion understanding, Lecture Notes in Computer Science, pp. 691-709, (2019)
[16] Ranjan A., Jampani V., Balles L., Kim K., Sun D., Wulff J., Black M.J., Competitive collaboration: joint unsupervised learning of depth, camera motion, optical flow and motion segmentation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12240-12249, (2019)
[17] Zou Y.L., Luo Z.L., Huang J.B., DF-Net: unsupervised joint learning of depth and flow using cross-task consistency, Computer Vision – ECCV 2018, pp. 38-55, (2018)
[18] Sun D.Q., Yang X.D., Liu M.Y., Kautz J., PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2018)
[19] Liu F.Y., Shen C.H., Lin G.S., Reid I., Learning depth from single monocular images using deep convolutional neural fields, IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 38(10), pp. 2024-2039, (2016)
[20] Garg R., Vijay K.B.G., Carneiro G., Reid I., Unsupervised CNN for single view depth estimation: geometry to the rescue, Computer Vision – ECCV 2016, pp. 740-756, (2016)