DisARM: Displacement Aware Relation Module for 3D Detection

Cited by: 15

Authors:

Duan, Yao [1]

Zhu, Chenyang [1]

Lan, Yuqing [1]

Yi, Renjiao [1]

Liu, Xinwang [1]

Xu, Kai [1]

Affiliations:

[1] National University of Defense Technology, Changsha, People's Republic of China

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022

DOI:

10.1109/CVPR52688.2022.01647

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We introduce the Displacement Aware Relation Module (DisARM), a novel neural network module for enhancing the performance of 3D object detection in point cloud scenes. The core idea is that extracting the most principal contextual information is critical for detection when the target is incomplete or featureless. We find that relations between proposals provide a good representation to describe the context. However, adopting relations between all the object or patch proposals for detection is inefficient, and an imbalanced combination of local and global relations brings extra noise that could mislead the training. Rather than working with all relations, we find that training with relations only between the most representative proposals, or anchors, can significantly boost the detection performance. Good anchors should be semantic-aware with no ambiguity and able to describe the whole layout of a scene with no redundancy. To find the anchors, we first apply a preliminary relation anchor module with an objectness-aware sampling approach and then devise a displacement-based module for weighing the relation importance for better utilization of contextual information. This lightweight relation module leads to significantly higher accuracy of object instance detection when plugged into state-of-the-art detectors. Evaluations on public benchmarks of real-world scenes show that our method achieves state-of-the-art performance on both SUN RGB-D and ScanNet V2. The code and models are publicly available at https://github.com/YaraDuan/DisARM.

Pages: 16959-16968

Page count: 10

