Homography Loss for Monocular 3D Object Detection

Cited by: 20

Authors:

Gu, Jiaqi ^{[1,2]}

Wu, Bojian ^{[1]}

Fan, Lubin ^{[1]}

Huang, Jianqiang ^{[1]}

Cao, Shen ^{[1]}

Xiang, Zhiyu ^{[2]}

Hua, Xian-Sheng ^{[1]}

Affiliations:

[1] Alibaba Cloud Computing Ltd., Hangzhou, People's Republic of China

[2] Zhejiang University, Hangzhou, People's Republic of China

Source:

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022) | 2022

Funding:

National Key Research and Development Program of China;

Keywords:

DOI:

10.1109/CVPR52688.2022.00115

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Monocular 3D object detection is an essential task in autonomous driving. However, most current methods treat each 3D object in the scene as an independent training sample and ignore the inherent geometric relations between objects, so they fail to leverage spatial constraints. In this paper, we propose a novel method that takes all objects into consideration and explores their mutual relationships to better estimate the 3D boxes. Moreover, since 2D detection is currently more reliable, we also investigate how to use the detected 2D boxes as guidance to globally constrain the optimization of the corresponding predicted 3D boxes. To this end, we propose a differentiable loss function, termed Homography Loss, which exploits both 2D and 3D information and balances the positional relationships between different objects through global constraints, so as to obtain more accurate predicted 3D boxes. Thanks to its concise design, the loss function is universal and can be plugged into any mature monocular 3D detector, significantly boosting performance over the baseline. Experiments demonstrate that our method achieves the best performance (as of Nov. 2021) among state-of-the-art methods, by a large margin, on the KITTI 3D datasets.

Citation

Pages: 1070-1079

Page count: 10

References: 45 in total
