From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection

Cited by: 68
Authors
Deng, Jiajun [1]
Zhou, Wengang [1]
Zhang, Yanyong [2]
Li, Houqiang [1]
Affiliations
[1] Univ Sci & Technol China USTC, Dept Elect Engn & Informat Sci, CAS Key Lab Technol Geospatial Informat Proc & Ap, Hefei 230026, Peoples R China
[2] Univ Sci & Technol China USTC, Dept Comp Sci, Hefei 230026, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Three-dimensional displays; Feature extraction; Proposals; Object detection; Laser radar; Detectors; Semantics; Point cloud; 3D object detection; LiDAR;
DOI
10.1109/TCSVT.2021.3100848
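Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract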
As an emerging data modality with precise distance sensing, LiDAR point clouds carry great expectations for 3D scene understanding. However, point clouds are sparsely distributed in 3D space and stored in an unstructured form, which makes it difficult to represent them for effective 3D object detection. To this end, we regard point clouds as hollow-3D data and propose a new architecture, Hallucinated Hollow-3D R-CNN (H²3D R-CNN), for 3D object detection. In our approach, we first extract multi-view features by sequentially projecting the point clouds into the perspective view and the bird's-eye view. Then, we hallucinate the 3D representation with a novel bilaterally guided multi-view fusion block. Finally, the 3D objects are detected by a box refinement module with a novel Hierarchical Voxel RoI Pooling operation. The proposed H²3D R-CNN offers a new way to exploit the complementary information in the perspective view and the bird's-eye view within an efficient framework. We evaluate our approach on the public KITTI dataset and the Waymo Open Dataset. Extensive experiments demonstrate the superiority of our method over state-of-the-art algorithms in both effectiveness and efficiency. The code is available at https://github.com/djiajunustc/H-23D_R-CNN.
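The two projections named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `to_bev_cell` and `to_perspective_angles`, the grid ranges, and the cell size are all assumptions chosen for illustration. It only shows how a single LiDAR point could be indexed in a bird's-eye-view grid and expressed as perspective-view (spherical) coordinates.

```python
import math


def to_bev_cell(point, x_range=(0.0, 70.4), y_range=(-40.0, 40.0), cell=0.1):
    """Map an (x, y, z) point to a (row, col) bird's-eye-view grid cell.

    The ranges and cell size are illustrative assumptions, not values
    taken from the paper. Returns None when the point is outside the grid.
    """
    x, y, _ = point  # height is ignored when looking straight down
    if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
        return None
    col = int((x - x_range[0]) / cell)
    row = int((y - y_range[0]) / cell)
    return row, col


def to_perspective_angles(point):
    """Return (azimuth, inclination, range) for an (x, y, z) point.

    Angles are in radians; range is the Euclidean distance to the sensor
    origin. These are the usual spherical coordinates behind a range image.
    """
    x, y, z = point
    rng = math.sqrt(x * x + y * y + z * z)
    azimuth = math.atan2(y, x)
    inclination = math.asin(z / rng) if rng > 0 else 0.0
    return azimuth, inclination, rng
```

Points that fall in the same bird's-eye-view cell can be far apart in the perspective view, and the reverse also holds, which is one way to see why the two views carry complementary information.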
Pages: 4722-4734
Number of pages: 13