Understanding Bird's-Eye View of Road Semantics Using an Onboard Camera

Cited by: 27
Authors
Can, Yigit Baran [1]
Liniger, Alexander [1]
Unal, Ozan [1]
Paudel, Danda [1]
Gool, Luc Van [1, 2]
Affiliations
[1] Swiss Fed Inst Technol, Comp Vis Lab, CH-8057 Zurich, Switzerland
[2] Katholieke Univ Leuven, B-3000 Leuven, Belgium
Keywords
Autonomous vehicle navigation; semantic scene understanding; mapping; simultaneous localization
DOI
10.1109/LRA.2022.3146898
CLC number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Autonomous navigation requires scene understanding of the action-space to move or anticipate events. For planner agents moving on the ground plane, such as autonomous vehicles, this translates to scene understanding in the bird's-eye view (BEV). However, the onboard cameras of autonomous cars are customarily mounted horizontally for a better view of the surroundings. In this work, we study scene understanding in the form of online estimation of semantic BEV maps using the video input from a single onboard camera. We study three key aspects of this task: image-level understanding, BEV-level understanding, and the aggregation of temporal information. Building on these three pillars, we propose a novel architecture that combines them. In our extensive experiments, we demonstrate that the considered aspects are complementary to each other for BEV understanding. Furthermore, the proposed architecture significantly surpasses the current state of the art. Code: https://github.com/ybarancan/BEV_feat_stitch.
Pages: 3302-3309
Page count: 8