AdaPoinTr: Diverse Point Cloud Completion With Adaptive Geometry-Aware Transformers

Cited by: 32

Authors:

Yu, Xumin [1]

Rao, Yongming [1]

Wang, Ziyi [1]

Lu, Jiwen [1]

Zhou, Jie [1]

Affiliations:

[1] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol BNRis, Dept Automat, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2023 / Vol. 45 / No. 12

Keywords:

Point cloud; transformers; point cloud completion;

DOI:

10.1109/TPAMI.2023.3309253

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a Transformer encoder-decoder architecture, called PoinTr, which reformulates point cloud completion as a set-to-set translation problem and employs a geometry-aware block to model local geometric relationships explicitly. The migration of Transformers enables our model to better learn structural knowledge and preserve detailed information for point cloud completion. Taking a step towards more complicated and diverse situations, we further propose AdaPoinTr by developing an adaptive query generation mechanism and designing a novel denoising task during completing a point cloud. Coupling these two techniques enables us to train the model efficiently and effectively: we reduce training time (by 15x or more) and improve completion performance (over 20%). Additionally, we propose two more challenging benchmarks with more diverse incomplete point clouds that can better reflect real-world scenarios to promote future research. We also show our method can be extended to the scene-level point cloud completion scenario by designing a new geometry-enhanced semantic scene completion framework. Extensive experiments on the existing and newly-proposed datasets demonstrate the effectiveness of our method, which attains 6.53 CD on PCN, 0.81 CD on ShapeNet-55 and 0.392 MMD on real-world KITTI, surpassing other work by a large margin and establishing new state-of-the-arts on various benchmarks. Most notably, AdaPoinTr can achieve such promising performance with higher throughputs and fewer FLOPs compared with the previous best methods in practice.
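The abstract reports its headline results in "CD" without defining it. Assuming CD denotes the Chamfer Distance commonly used to score point cloud completion, the metric can be sketched as below. The function name `chamfer_distance`, the `squared` switch, and the choice to sum the two directions are illustrative assumptions, not taken from this record; published benchmarks differ in whether they use plain or squared distances and in how the result is scaled, so the numbers quoted above should not be expected to match this sketch directly.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def chamfer_distance(pred: Sequence[Point], gt: Sequence[Point],
                     squared: bool = False) -> float:
    """Symmetric Chamfer Distance between two non-empty point sets.

    Assumption: the two directional terms are summed. With squared=False the
    nearest-neighbour Euclidean distances are averaged; with squared=True the
    squared distances are averaged instead.
    """
    if len(pred) == 0 or len(gt) == 0:
        raise ValueError("both point sets must be non-empty")

    def one_way(src: Sequence[Point], dst: Sequence[Point]) -> float:
        # Mean distance from each point in src to its nearest neighbour in dst.
        total = 0.0
        for p in src:
            nearest = min(math.dist(p, q) for q in dst)
            total += nearest ** 2 if squared else nearest
        return total / len(src)

    return one_way(pred, gt) + one_way(gt, pred)


# Toy example: the prediction misses one corner of the ground-truth shape.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(chamfer_distance(predicted, ground_truth))
```

This brute-force version costs O(N x M) distance evaluations, which is fine for illustration; practical evaluation code typically relies on a GPU kernel or a spatial index for the nearest-neighbour search.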


Pages: 14114-14130

Page count: 17

