Towards Generalization in Target-Driven Visual Navigation by Using Deep Reinforcement Learning

Cited by: 67
Authors
Devo, Alessandro [1]
Mezzetti, Giacomo [1]
Costante, Gabriele [1]
Fravolini, Mario L. [1]
Valigi, Paolo [1]
Affiliations
[1] Univ Perugia, Dept Engn, I-06125 Perugia, Italy
Keywords
Navigation; Visualization; Task analysis; Training; Machine learning; Simultaneous localization and mapping; Deep learning in robotics and automation; Target-driven visual navigation; Visual-based navigation; Visual learning; Simultaneous localization; Obstacle avoidance
DOI
10.1109/TRO.2020.2994002
Chinese Library Classification (CLC) number
TP24 [Robotics]
Subject classification codes
080202; 1405
Abstract
Among the main challenges in robotics, target-driven visual navigation has gained increasing interest in recent years. In this task, an agent has to navigate an environment to reach a user-specified target using vision alone. Recent successful approaches rely on deep reinforcement learning, which has proven to be an effective framework for learning navigation policies. However, current state-of-the-art methods require retraining, or at least fine-tuning, the model for every new environment and object. In real scenarios, this operation can be extremely challenging or even dangerous. For these reasons, we address generalization in target-driven visual navigation by proposing a novel architecture composed of two networks, both trained exclusively in simulation. The first explores the environment, while the second locates the target. They are specifically designed to work together but are trained separately to help generalization. In this article, we test our agent in both simulated and real scenarios, and validate its generalization capabilities through extensive experiments with previously unseen goals and unknown mazes, including mazes much larger than those used for training.
Pages: 1546-1561
Number of pages: 16