Empty Cities: A Dynamic-Object-Invariant Space for Visual SLAM

Cited by: 24
Authors
Bescos, Berta [1]
Cadena, Cesar [2]
Neira, Jose [3]
Affiliations
[1] Univ Zaragoza, Dept Comp Sci & Syst Engn, Zaragoza 50018, Spain
[2] Swiss Fed Inst Technol, Mech & Proc Engn, CH-8090 Zurich, Switzerland
[3] Univ Zaragoza, Inst Invest Ingn Aragon, Zaragoza 50018, Spain
Keywords
Vehicle dynamics; Simultaneous localization and mapping; Task analysis; Semantics; Dynamics; Deep learning; Gallium nitride; Visual SLAM; Inpainting; Dynamic objects; GANs; IMAGE;
DOI
10.1109/TRO.2020.3031267
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
In this article, we present a data-driven approach to obtain the static image of a scene, eliminating dynamic objects that might have been present when the scene was traversed with a camera. The general objective is to improve vision-based localization and mapping in dynamic environments, where the presence (or absence) of different dynamic objects at different times makes these tasks less robust. We introduce an end-to-end deep learning framework that turns images of an urban environment containing dynamic content, such as vehicles or pedestrians, into realistic static frames suitable for localization and mapping. This objective faces two main challenges: detecting the dynamic objects and inpainting the occluded static background. The first challenge is addressed with a convolutional network that learns a multiclass semantic segmentation of the image. The second is approached with a generative adversarial model that, taking as input the original dynamic image and the computed dynamic/static binary mask, generates the final static image. The framework uses two new losses: one based on image steganalysis techniques, which improves inpainting quality, and another based on ORB features, designed to enhance feature matching between real and hallucinated image regions. To validate our approach, we perform an extensive evaluation with the hallucinated images on tasks that are affected by dynamic entities, namely visual odometry, place recognition, and multiview stereo. Code is available at https://github.com/bertabescos/EmptyCities_SLAM.
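The step between the two stages described in the abstract, turning a multiclass semantic segmentation into a dynamic/static binary mask, can be illustrated with a minimal sketch. This is not the authors' implementation: the class IDs and the function name `dynamic_mask` are hypothetical, chosen only for illustration, and the label map is a plain nested list rather than a real network output.

```python
# Hypothetical class IDs treated as dynamic (e.g. pedestrians, vehicles).
# The actual label set used in the paper is not given in this record.
DYNAMIC_CLASS_IDS = frozenset({11, 13})


def dynamic_mask(label_map, dynamic_ids=DYNAMIC_CLASS_IDS):
    """Collapse a per-pixel class-ID map into a binary mask.

    Returns a nested list of the same shape as label_map, holding 1 where
    the pixel's class is in dynamic_ids and 0 elsewhere.
    """
    return [[1 if label in dynamic_ids else 0 for label in row]
            for row in label_map]


# Toy 2x3 segmentation: classes 11 and 13 are assumed dynamic, the rest static.
labels = [
    [0, 11, 2],
    [13, 0, 11],
]
mask = dynamic_mask(labels)
```

In the pipeline the abstract describes, a mask of this kind would be passed to the generative model together with the original image; the sketch covers only the mask computation.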
Pages: 433-451
Page count: 19