NavTr: Object-Goal Navigation With Learnable Transformer Queries

Times Cited: 0
Authors
Mao, Qiuyu [1 ]
Wang, Jikai [1 ]
Xu, Meng [1 ]
Chen, Zonghai [1 ]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei 230026, Peoples R China
Source
IEEE ROBOTICS AND AUTOMATION LETTERS | 2024, Vol. 9, No. 12
Funding
National Natural Science Foundation of China
Keywords
Navigation; Transformers; Semantics; Visualization; Vectors; Three-dimensional displays; Long short-term memory; Encoding; Computer architecture; Aggregates; Vision-based navigation; Representation learning; Reinforcement learning
DOI
10.1109/LRA.2024.3497718
CLC Number (Chinese Library Classification)
TP24 [Robotics]
Discipline Code
080202; 1405
Abstract
This letter introduces Navigation Transformer (NavTr), a novel framework for object-goal navigation that uses Transformer queries to enhance the learning and representation of environment states. By integrating semantic information, object positions, and neighborhood information, NavTr creates a unified, comprehensive, and extensible state representation for the object-goal navigation task. Within the framework, the Transformer queries implicitly learn inter-object relationships, which facilitates a high-level understanding of the environment. Additionally, NavTr implements target-oriented supervisory signals, such as rotation rewards and a spatial loss, which improve exploration efficiency within the reinforcement learning framework. NavTr outperforms popular graph-based and attention-based methods by a large margin in terms of success rate (SR) and success weighted by path length (SPL). Extensive experiments on the AI2-THOR dataset demonstrate the effectiveness of our approach.
Pages: 11738 - 11745
Number of pages: 8
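
Illustrative sketch (not from the paper): the abstract above describes learnable Transformer queries that aggregate semantic, positional, and neighborhood information about detected objects into a single navigation state. The minimal PyTorch sketch below shows one way such a query-based state encoder could look, in the DETR style of learned queries cross-attending to object features. The module name QueryStateEncoder, every dimension, and the exact composition of the object features are assumptions made for illustration; this is not the authors' implementation.

    # Minimal sketch of a learnable-query state encoder (assumptions only,
    # not NavTr's released code).
    import torch
    import torch.nn as nn


    class QueryStateEncoder(nn.Module):
        """Aggregates per-object features into a fixed-size navigation state
        using a set of learnable Transformer decoder queries (DETR-style)."""

        def __init__(self, num_queries=8, d_model=128, n_heads=4, n_layers=2,
                     num_classes=100):
            super().__init__()
            # Learnable queries: each query can specialize to a different
            # aspect of the scene (e.g. target-related objects, free space).
            self.queries = nn.Parameter(torch.randn(num_queries, d_model))
            # Semantic class embedding, object position projection (here a
            # normalized bounding box), and a neighborhood-feature projection
            # are all mapped into the same d_model space and summed.
            self.class_embed = nn.Embedding(num_classes, d_model)
            self.pos_proj = nn.Linear(4, d_model)
            self.neigh_proj = nn.Linear(d_model, d_model)
            layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
            self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
            self.out = nn.Linear(num_queries * d_model, d_model)

        def forward(self, class_ids, boxes, neigh_feats):
            # class_ids: (B, N) int64, boxes: (B, N, 4), neigh_feats: (B, N, d_model)
            obj = (self.class_embed(class_ids)
                   + self.pos_proj(boxes)
                   + self.neigh_proj(neigh_feats))       # (B, N, d_model)
            q = self.queries.unsqueeze(0).expand(obj.size(0), -1, -1)
            # Queries cross-attend to the detected objects; inter-object
            # relations are captured implicitly through attention.
            dec = self.decoder(tgt=q, memory=obj)         # (B, Q, d_model)
            return self.out(dec.flatten(1))               # (B, d_model) state vector


    if __name__ == "__main__":
        enc = QueryStateEncoder()
        state = enc(torch.randint(0, 100, (2, 12)),       # object classes
                    torch.rand(2, 12, 4),                 # normalized boxes
                    torch.randn(2, 12, 128))              # neighborhood features
        print(state.shape)  # torch.Size([2, 128])

The resulting state vector would then be fed to a reinforcement-learning policy; the target-oriented rotation rewards and spatial loss mentioned in the abstract are training signals applied on top of such an encoder and are not shown here.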