Relation Graph Network for 3D Object Detection in Point Clouds

Cited by: 55
Authors
Feng, Mingtao [1]
Gilani, Syed Zulqarnain [2, 3]
Wang, Yaonan [4]
Zhang, Liang [1, 5]
Mian, Ajmal [2]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian 710071, Peoples R China
[2] Univ Western Australia, Dept Comp Sci & Software Engn, Perth, WA 6009, Australia
[3] Edith Cowan Univ, Sch Sci, Joondalup, WA 6027, Australia
[4] Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Peoples R China
[5] Shanghai BNC, Shanghai 200072, Peoples R China
Funding
National Natural Science Foundation of China; Australian Research Council
Keywords
Three-dimensional displays; Proposals; Object detection; Two dimensional displays; Feature extraction; Laser radar; Semantics; 3D object detection; point cloud; deep learning; SEGMENTATION;
DOI
10.1109/TIP.2020.3031371
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional Neural Networks (CNNs) have emerged as a powerful tool for object detection in 2D images. However, their power has not been fully realised for detecting 3D objects directly in point clouds without conversion to regular grids. Moreover, existing state-of-the-art 3D object detection methods aim to recognize objects individually without exploiting their relationships during learning or inference. In this article, we first propose a strategy that associates the predictions of direction vectors with pseudo geometric centers, leading to a win-win solution for 3D bounding box candidate regression. Secondly, we propose point attention pooling to extract uniform appearance features for each 3D object proposal, benefiting from the learned direction features, semantic features and spatial coordinates of the object points. Finally, the appearance features are used together with the position features to build 3D object-object relationship graphs for all proposals to model their co-existence. We explore the effect of relation graphs on proposals' appearance feature enhancement under supervised and unsupervised settings. The proposed relation graph network comprises a 3D object proposal generation module and a 3D relation module, making it an end-to-end trainable network for detecting 3D objects in point clouds. Experiments on challenging benchmark point cloud datasets (SunRGB-D, ScanNet and KITTI) show that our algorithm performs better than existing state-of-the-art methods.
Pages: 92-107
Number of pages: 16