BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision

Cited by: 81
Authors
Yang, Chenyu [1]
Chen, Yuntao [2]
Tian, Hao [3]
Tao, Chenxin [1]
Zhu, Xizhou [3]
Zhang, Zhaoxiang [2,4]
Huang, Gao [1]
Li, Hongyang [5]
Qiao, Yu [5]
Lu, Lewei [3]
Zhou, Jie [1]
Dai, Jifeng [1,5]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Chinese Acad Sci, Ctr Artificial Intelligence & Robot, HKISI, Beijing, Peoples R China
[3] SenseTime Res, Beijing, Peoples R China
[4] Chinese Acad Sci CASIA, Inst Automat, Beijing, Peoples R China
[5] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
Keywords
DOI
10.1109/CVPR52729.2023.01710
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a novel bird's-eye-view (BEV) detector with perspective supervision, which converges faster and better suits modern image backbones. Existing state-of-the-art BEV detectors are often tied to certain depth pre-trained backbones like VoVNet, hindering the synergy between booming image backbones and BEV detectors. To address this limitation, we prioritize easing the optimization of BEV detectors by introducing perspective view supervision. To this end, we propose a two-stage BEV detector, where proposals from the perspective head are fed into the bird's-eye-view head for final predictions. To evaluate the effectiveness of our model, we conduct extensive ablation studies focusing on the form of supervision and the generality of the proposed detector. The proposed method is verified with a wide spectrum of traditional and modern image backbones and achieves new SoTA results on the large-scale nuScenes dataset. The code shall be released soon.
Pages: 17830-17839
Page count: 10