Monocular 3D Object Detection via Feature Domain Adaptation

Cited by: 31

Authors:

Ye, Xiaoqing [1]

Du, Liang [2,3]

Shi, Yifeng [1]

Li, Yingying [1]

Tan, Xiao [1]

Feng, Jianfeng [2,3]

Ding, Errui [1]

Wen, Shilei [1]

Affiliations:

[1] Baidu Inc, Beijing, Peoples R China

[2] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, Shanghai, Peoples R China

[3] Fudan Univ, Key Lab Computat Neurosci & Brain Inspired Intell, Minist Educ, Shanghai, Peoples R China

Source:

COMPUTER VISION - ECCV 2020, PT IX | 2020 / Vol. 12354

Funding:

National Key Research and Development Program of China; National Natural Science Foundation of China;

Keywords:

Monocular; 3D Object detection; Domain adaptation; Pseudo-Lidar;

DOI:

10.1007/978-3-030-58545-7_2

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Monocular 3D object detection is a challenging task due to unreliable depth, resulting in a distinct performance gap between monocular and LiDAR-based approaches. In this paper, we propose a novel domain-adaptation-based monocular 3D object detection framework named DA-3Ddet, which adapts features from the unsound image-based pseudo-LiDAR domain to the accurate real LiDAR domain to boost performance. To solve the overlooked problem of inconsistency between the foreground masks of pseudo and real LiDAR, which is caused by inaccurately estimated depth, we also introduce a context-aware foreground segmentation module that helps to include relevant points in foreground masking. Extensive experiments on the KITTI dataset demonstrate that our simple yet effective framework outperforms other state-of-the-art methods by a large margin.

Pages: 17-34

Page count: 18
