Voxel Field Fusion for 3D Object Detection

Cited by: 62

Authors:

Li, Yanwei ^{[1,3]}

Qi, Xiaojuan ^{[2]}

Chen, Yukang ^{[1,3]}

Wang, Liwei ^{[1,3]}

Li, Zeming ^{[3]}

Sun, Jian ^{[3]}

Jia, Jiaya ^{[1,3,4]}

Affiliations:

[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China

[2] Univ Hong Kong, Hong Kong, Peoples R China

[3] MEGVII Technol, Beijing, Peoples R China

[4] SmartMore, Hong Kong, Peoples R China

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022

DOI:

10.1109/CVPR52688.2022.00119

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this work, we present a conceptually simple yet effective framework for cross-modality 3D object detection, named voxel field fusion. The proposed approach aims to maintain cross-modality consistency by representing and fusing augmented image features as a ray in the voxel field. To this end, a learnable sampler is first designed to sample vital features from the image plane; these features are projected to the voxel grid in a point-to-ray manner, which maintains consistency in feature representation with spatial context. In addition, ray-wise fusion is conducted to fuse features with the supplemental context in the constructed voxel field. We further develop a mixed augmentor to align feature-variant transformations, which bridges the modality gap in data augmentation. The proposed framework is demonstrated to achieve consistent gains on various benchmarks and outperforms previous fusion-based methods on the KITTI and nuScenes datasets. Code is made available at https://github.com/dvlab-research/VFF.
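The point-to-ray projection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera model, a voxel grid expressed directly in the camera frame, and fixed depth samples along the ray. The function name `pixel_to_ray_voxels` and all of its parameters are hypothetical and chosen only for illustration.

```python
from math import floor


def pixel_to_ray_voxels(u, v, fx, fy, cx, cy, depths,
                        grid_min, voxel_size, grid_shape):
    """Return the voxel indices crossed by the camera ray through pixel (u, v).

    Assumptions (not taken from the paper): pinhole intrinsics (fx, fy, cx, cy),
    a voxel grid aligned with the camera frame, and a fixed list of depth
    samples along the ray.
    """
    # Ray direction per unit depth in the camera frame (z is the depth axis).
    dx = (u - cx) / fx
    dy = (v - cy) / fy

    voxels = []
    for d in depths:
        # 3D point on the ray at depth d.
        point = (dx * d, dy * d, d)
        # Integer voxel index of that point.
        idx = tuple(floor((p - lo) / voxel_size)
                    for p, lo in zip(point, grid_min))
        # Keep only indices inside the grid, without duplicates.
        inside = all(0 <= i < n for i, n in zip(idx, grid_shape))
        if inside and idx not in voxels:
            voxels.append(idx)
    return voxels


# Example: the principal-point pixel maps to a ray along the depth axis.
ray = pixel_to_ray_voxels(
    u=320, v=240, fx=500.0, fy=500.0, cx=320.0, cy=240.0,
    depths=[0.5, 1.5, 2.5, 3.5, 4.5],
    grid_min=(-1.0, -1.0, 0.0), voxel_size=1.0, grid_shape=(2, 2, 4),
)
print(ray)
```

In a fusion pipeline, an image feature sampled at the pixel would be written to (or combined along) each voxel in the returned list, so one image point contributes to a whole ray rather than to a single voxel. How the features are weighted and fused along the ray is specific to the paper and is not reproduced here.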

Pages: 1110-1119

Page count: 10

