Interactive Semantic Map Representation for Skill-Based Visual Object Navigation

Cited by: 0
Authors
Zemskova, Tatiana [1, 2]
Staroverov, Aleksei [1]
Muravyev, Kirill [3]
Yudin, Dmitry A. [1, 2]
Panov, Aleksandr I. [1, 2, 3]
Affiliations
[1] Artificial Intelligence Res Inst AIRI, Moscow 121170, Russia
[2] Moscow Inst Phys & Technol, Dolgoprudnyi 141701, Russia
[3] Fed Res Ctr Comp Sci & Control, Moscow 117312, Russia
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
Russian Science Foundation
Keywords
Semantics; Navigation; Task analysis; Robots; Visualization; Planning; Habitats; Reinforcement learning; Mobile robots; Interactive systems; Semantic map; navigation; robotics; reinforcement learning; frontier-based exploration;
DOI
10.1109/ACCESS.2024.3380450
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Visual object navigation is one of the key tasks in mobile robotics. One of its most important components is an accurate semantic representation of the scene, which is needed to identify and reach a goal object. This paper introduces a new representation of a scene semantic map, formed while the embodied agent interacts with the indoor environment. It is based on a neural network method that adjusts the weights of the segmentation model by backpropagating the predicted fusion loss values during inference on a regular (backward) or delayed (forward) image sequence. We integrate this representation into a full-fledged navigation approach called SkillTron. The method can select robot skills from end-to-end policies based on reinforcement learning and from classic map-based planning methods. The proposed approach makes it possible to form both intermediate goals for robot exploration and the final goal for object navigation. We conduct extensive experiments with the proposed approach in the Habitat environment, demonstrating its significant superiority over state-of-the-art approaches in terms of navigation quality metrics. The developed code and custom datasets are publicly available at github.com/AIRI-Institute/skill-fusion.
Pages: 44628-44639
Number of pages: 12