PT-FlowNet: Scene Flow Estimation on Point Clouds With Point Transformer

Cited by: 18
Authors
Fu, Jingyun [1]
Xiang, Zhiyu [2]
Qiao, Chengyu [1]
Bai, Tingming [1]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Eletron Engn, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Zhejiang Prov Key Lab Informat Proc Commun & Netwo, Hangzhou 310027, Peoples R China
Keywords
Point cloud compression; Estimation; Transformers; Three-dimensional displays; Feature extraction; Iterative methods; Task analysis; Computer vision for automation; vision-based navigation;
DOI
10.1109/LRA.2023.3254431
Chinese Library Classification (CLC) number
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
As a low-level task of 3D perception, scene flow is a fundamental representation of dynamic scenes and provides non-rigid motion descriptions for the objects in the 3D environment, which can strongly support many upper-level applications. Inspired by the revolutionary success of deep learning, many attention-based neural networks have recently been proposed to estimate scene flow from consecutive point clouds. However, extracting effective features and estimating accurate point motions for irregular and occluded point clouds remains a challenging task. In this letter, we propose PT-FlowNet, the first end-to-end scene flow estimation network embedding the point transformer (PT) into all functional stages of the task. In particular, we design novel PT-based modules for the point feature extraction, iterative flow update, and flow refinement stages to encourage effective point-level feature aggregation. Experimental results on the FlyingThings3D and KITTI datasets show that our PT-FlowNet achieves state-of-the-art performance. Trained on synthetic data only, our PT-FlowNet can generalize to real-world scans and outperforms the existing methods by at least 36.2% for the EPE3D metric on the KITTI dataset.
Pages: 2566-2573
Number of pages: 8