SLIM: Self-Supervised LiDAR Scene Flow and Motion Segmentation

Cited by: 58
Authors
Baur, Stefan Andreas [1, 2]
Emmerichs, David Josef [1 ]
Moosmann, Frank [1 ]
Pinggera, Peter [1 ]
Ommer, Bjoern [4, 5]
Geiger, Andreas [2, 3]
Affiliations
[1] Mercedes Benz AG, Stuttgart, Germany
[2] MPI IS, Tubingen, Germany
[3] Univ Tubingen, Tubingen, Germany
[4] Ludwig Maximilian Univ Munich, Munich, Germany
[5] Heidelberg Univ, Heidelberg, Germany
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Keywords
DOI
10.1109/ICCV48922.2021.01288
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, several frameworks for self-supervised learning of 3D scene flow on point clouds have emerged. Scene flow inherently separates every scene into multiple moving agents and a large class of points following a single rigid sensor motion. However, existing methods do not leverage this property of the data in their self-supervised training routines which could improve and stabilize flow predictions. Based on the discrepancy between a robust rigid ego-motion estimate and a raw flow prediction, we generate a self-supervised motion segmentation signal. The predicted motion segmentation, in turn, is used by our algorithm to attend to stationary points for aggregation of motion information in static parts of the scene. We learn our model end-to-end by backpropagating gradients through Kabsch's algorithm and demonstrate that this leads to accurate ego-motion which in turn improves the scene flow estimate. Using our method, we show state-of-the-art results across multiple scene flow metrics for different real-world datasets, showcasing the robustness and generalizability of this approach. We further analyze the performance gain when performing joint motion segmentation and scene flow in an ablation study. We also present a novel network architecture for 3D LiDAR scene flow which is capable of handling an order of magnitude more points during training than previously possible.
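The two geometric steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `weighted_kabsch` and `motion_mask`, the per-point `weights` (standing in for a predicted probability that a point is static), and the 0.1 m `threshold` are all assumptions made for illustration. The sketch fits a rigid ego-motion with a weighted form of Kabsch's algorithm, then flags points whose raw flow disagrees with the flow that the rigid motion would induce. In the paper the Kabsch step is differentiated through during training; plain NumPy does not provide gradients, so this only shows the forward computation.

```python
import numpy as np


def weighted_kabsch(src, dst, weights):
    """Weighted least-squares rigid fit: returns (R, t) with dst ~ src @ R.T + t.

    src, dst: (N, 3) arrays of corresponding points.
    weights:  (N,) non-negative weights (assumed here to encode how likely
              each point is to be static); at least three points need weight > 0.
    """
    w = weights / weights.sum()
    src_mean = w @ src
    dst_mean = w @ dst
    src_c = src - src_mean
    dst_c = dst - dst_mean
    # 3x3 weighted cross-covariance between the centered point sets.
    H = (src_c * w[:, None]).T @ dst_c
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t


def motion_mask(points, raw_flow, R, t, threshold=0.1):
    """Mark a point as moving when its raw flow deviates from the rigid flow.

    threshold is a hypothetical value in the same units as the points.
    """
    rigid_flow = points @ R.T + t - points
    residual = np.linalg.norm(raw_flow - rigid_flow, axis=1)
    return residual > threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.uniform(-20.0, 20.0, size=(200, 3))

    # Synthetic ego-motion: small yaw plus a forward translation.
    yaw = 0.05
    R_true = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
                       [np.sin(yaw), np.cos(yaw), 0.0],
                       [0.0, 0.0, 1.0]])
    t_true = np.array([1.0, 0.0, 0.0])

    # Raw flow equals the rigid flow, except for five independently moving points.
    raw_flow = points @ R_true.T + t_true - points
    raw_flow[:5] += np.array([2.0, 0.0, 0.0])

    # Assumed static-ness weights: zero for the movers, one elsewhere.
    weights = np.ones(len(points))
    weights[:5] = 0.0

    R, t = weighted_kabsch(points, points + raw_flow, weights)
    mask = motion_mask(points, raw_flow, R, t)
    print("points flagged as moving:", int(mask.sum()))
```

The weights are what couple the two steps: down-weighting points believed to be moving keeps them from biasing the ego-motion fit, and the resulting residuals in turn provide the segmentation signal.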
Pages: 13106-13116
Page count: 11
Related papers
59 in total
[1]  
AndreasWedel Clemens, 2008, LECT NOTES COMPUTER, V5302, P739
[2]   Multi-view Scene Flow Estimation: A View Centered Variational Approach [J].
Basha, Tali ;
Moses, Yael ;
Kiryati, Nahum .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2013, 101 (01) :6-21
[3]   PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds [J].
Behl, Aseem ;
Paschalidou, Despoina ;
Donne, Simon ;
Geiger, Andreas .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :7954-7963
[4]   SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences [J].
Behley, Jens ;
Garbade, Martin ;
Milioto, Andres ;
Quenzel, Jan ;
Behnke, Sven ;
Stachniss, Cyrill ;
Gall, Juergen .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :9296-9306
[5]   MoA-Net: Self-supervised Motion Segmentation [J].
Bideau, Pia ;
Menon, Rakesh R. ;
Learned-Miller, Erik .
COMPUTER VISION - ECCV 2018 WORKSHOPS, PT VI, 2019, 11134 :715-730
[6]   nuScenes: A multimodal dataset for autonomous driving [J].
Caesar, Holger ;
Bankiti, Varun ;
Lang, Alex H. ;
Vora, Sourabh ;
Liong, Venice Erin ;
Xu, Qiang ;
Krishnan, Anush ;
Pan, Yu ;
Baldan, Giancarlo ;
Beijbom, Oscar .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :11618-11628
[7]   Multi-view scene capture by surfel sampling: From video streams to non-rigid 3D motion, shape and reflectance [J].
Carceroni, RL ;
Kutulakos, KN .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2002, 49 (2-3) :175-214
[8]   Self-supervised Learning with Geometric Constraints in Monocular Video Connecting Flow, Depth, and Camera [J].
Chen, Yuhua ;
Schmid, Cordelia ;
Sminchisescu, Cristian .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :7062-7071
[9]  
Dosovitskiy A, 2017, PR MACH LEARN RES, V78
[10]   Vision meets robotics: The KITTI dataset [J].
Geiger, A. ;
Lenz, P. ;
Stiller, C. ;
Urtasun, R. .
INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2013, 32 (11) :1231-1237