PT-FlowNet: Scene Flow Estimation on Point Clouds With Point Transformer

Times cited: 19
Authors
Fu, Jingyun [1]
Xiang, Zhiyu [2]
Qiao, Chengyu [1]
Bai, Tingming [1]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Electron Engn, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Zhejiang Prov Key Lab Informat Proc Commun & Netwo, Hangzhou 310027, Peoples R China
Keywords
Point cloud compression; Estimation; Transformers; Three-dimensional displays; Feature extraction; Iterative methods; Task analysis; Computer vision for automation; vision-based navigation
DOI
10.1109/LRA.2023.3254431
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
As a low-level task of 3D perception, scene flow is a fundamental representation of dynamic scenes and provides non-rigid motion descriptions for the objects in the 3D environment, which can strongly support many upper-level applications. Inspired by the revolutionary success of deep learning, many attention-based neural networks have recently been proposed to estimate scene flow from consecutive point clouds. However, extracting effective features and estimating accurate point motions for irregular and occluded point clouds remains a challenging task. In this letter, we propose PT-FlowNet, the first end-to-end scene flow estimation network embedding the point transformer (PT) into all functional stages of the task. In particular, we design novel PT-based modules for the point feature extraction, iterative flow update, and flow refinement stages to encourage effective point-level feature aggregation. Experimental results on the FlyingThings3D and KITTI datasets show that our PT-FlowNet achieves state-of-the-art performance. Trained on synthetic data only, our PT-FlowNet can generalize to real-world scans and outperforms the existing methods by at least 36.2% for the EPE3D metric on the KITTI dataset.
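The abstract reports results in terms of EPE3D, which is commonly defined as the mean Euclidean distance between predicted and ground-truth 3D flow vectors. The following is a minimal sketch of that metric and of a relative error reduction, not the authors' evaluation code. The function names `epe3d` and `relative_reduction` are hypothetical, and the toy values are made up for illustration; they are not results from the paper.

```python
import math


def epe3d(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and ground-truth 3D flow vectors."""
    if len(pred_flow) != len(gt_flow) or not pred_flow:
        raise ValueError("flows must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(pred_flow, gt_flow))
    return total / len(pred_flow)


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Toy flows for two points (illustrative values only, not from the paper).
pred = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0)]
gt = [(3.0, 4.0, 0.0), (1.0, 2.0, 2.0)]

error = epe3d(pred, gt)  # per-point errors are 5.0 and 0.0
print(error)
```

Under this reading, a statement such as "outperforms by at least 36.2%" would correspond to `relative_reduction(baseline, ours) >= 0.362`, assuming the percentage is a relative reduction in EPE3D against the best prior method.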
Pages: 2566-2573
Number of pages: 8
Related papers
48 in total
[41]  
Chang AX, 2015, Arxiv, DOI arXiv:1512.03012
[42]   SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud based Place Recognition [J].
Xia, Yan ;
Xu, Yusheng ;
Li, Shuang ;
Wang, Rui ;
Du, Juan ;
Cremers, Daniel ;
Stilla, Uwe .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :11343-11352
[43]   PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling [J].
Yan, Xu ;
Zheng, Chaoda ;
Li, Zhen ;
Wang, Sheng ;
Cui, Shuguang .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, :5588-5597
[44]  
Zhang Y., 2020, P ASIAN C COMPUTER V
[45]   Point Transformer [J].
Zhao, Hengshuang ;
Jiang, Li ;
Jia, Jiaya ;
Torr, Philip ;
Koltun, Vladlen .
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, :16239-16248
[46]   Exploring Self-attention for Image Recognition [J].
Zhao, Hengshuang ;
Jia, Jiaya ;
Koltun, Vladlen .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :10073-10082
[47]   PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing [J].
Zhao, Hengshuang ;
Jiang, Li ;
Fu, Chi-Wing ;
Jia, Jiaya .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :5550-5558
[48]   Adaptive Graph Convolution for Point Cloud Analysis [J].
Zhou, Haoran ;
Feng, Yidan ;
Fang, Mingsheng ;
Wei, Mingqiang ;
Qin, Jing ;
Lu, Tong .
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, :4945-4954