Lightweight and Effective Convolutional Neural Networks for Vehicle Viewpoint Estimation From Monocular Images

Cited by: 1
Authors
Magistri, Simone [1]
Boschi, Marco [2]
Sambo, Francesco [3]
de Andrade, Douglas Coimbra [3]
Simoncini, Matteo [1,3]
Kubin, Luca [3]
Taccari, Leonardo [3]
De Luigi, Luca [2]
Salti, Samuele [2]
Affiliations
[1] Univ Florence, Dept Informat Engn DINFO, I-50139 Florence, Italy
[2] Univ Bologna, Dept Comp Sci & Engn DISI, I-40136 Bologna, Italy
[3] Verizon Connect Res, I-50144 Florence, Italy
Keywords
Azimuth; convolutional neural networks; machine learning; monocular images; vehicles; yaw
DOI
10.1109/TITS.2022.3216359
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
Vehicle viewpoint estimation from monocular images is a crucial component for autonomous driving vehicles and for fleet management applications. In this paper, we make several contributions to advance the state-of-the-art on this problem. We show the effectiveness of applying a smoothing filter to the output neurons of a Convolutional Neural Network (CNN) when estimating vehicle viewpoint. We point out the overlooked fact that, under the same viewpoint, the appearance of a vehicle is strongly influenced by its position in the image plane, which renders viewpoint estimation from appearance an ill-posed problem. We show how, by inserting in the model a CoordConv layer to provide the coordinates of the vehicle, we are able to solve such ambiguity and greatly increase performance. Finally, we introduce a new data augmentation technique that improves viewpoint estimation on vehicles that are closer to the camera or partially occluded. All these improvements let a lightweight CNN reach optimal results while keeping inference time low. An extensive evaluation on a viewpoint estimation benchmark (Pascal3D+) and on actual vehicle camera data (nuScenes) shows that our method significantly outperforms the state-of-the-art in vehicle viewpoint estimation, both in terms of accuracy and memory footprint.
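The abstract names two of the ideas (position-aware input via a CoordConv layer, and a smoothing filter on the output neurons) without giving implementation details. The following is a minimal NumPy sketch of how such steps could look, not the authors' implementation. It assumes the vehicle is given as a crop with a known bounding box in the full image, that the viewpoint is predicted as scores over azimuth bins, and that smoothing wraps around at 360 degrees. The function names `add_coord_channels` and `circular_smooth`, the [-1, 1] scaling, and the example kernel are all hypothetical choices made for illustration.

```python
import numpy as np


def add_coord_channels(crop, box, image_size):
    """Append two channels holding each crop pixel's (x, y) position in the
    full image, scaled to [-1, 1] (assumed convention, CoordConv-style).

    crop:       array of shape (H, W, C)
    box:        (x_min, y_min, x_max, y_max) of the crop in full-image pixels
    image_size: (width, height) of the full image
    """
    h, w = crop.shape[:2]
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    # Coordinates of the crop's columns and rows, expressed in the full image.
    xs = np.linspace(x_min, x_max, w) / img_w * 2.0 - 1.0
    ys = np.linspace(y_min, y_max, h) / img_h * 2.0 - 1.0
    xx, yy = np.meshgrid(xs, ys)  # both have shape (H, W)
    return np.concatenate(
        [crop.astype(np.float32), xx[..., None], yy[..., None]], axis=-1
    )


def circular_smooth(scores, kernel):
    """Smooth per-bin azimuth scores with an odd-length kernel, wrapping
    around so the last bin is treated as adjacent to the first."""
    scores = np.asarray(scores, dtype=np.float64)
    kernel = np.asarray(kernel, dtype=np.float64)
    k = len(kernel) // 2
    padded = np.pad(scores, k, mode="wrap")
    return np.convolve(padded, kernel, mode="valid")


# Usage: a 4x6 RGB crop covering the whole of a 100x50 image.
crop = np.zeros((4, 6, 3))
with_coords = add_coord_channels(crop, (0, 0, 100, 50), (100, 50))

# Usage: one-hot scores over four azimuth bins, smoothed circularly.
smoothed = circular_smooth([1.0, 0.0, 0.0, 0.0], [0.25, 0.5, 0.25])
```

In this sketch the two extra channels let a network distinguish crops that look alike but come from different image positions, which is the ambiguity the abstract describes. The wrap-around padding reflects the fact that azimuth is periodic.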
Pages: 191-200
Page count: 10