Semantic Policy Network for Zero-Shot Object Goal Visual Navigation

Cited by: 3

Authors:

Zhao, Qianfan [1,2]

Zhang, Lu [1,2]

He, Bin [3]

Liu, Zhiyong [1,2]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodel Artificial Intelligence S, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100190, Peoples R China

[3] Tongji Univ, Coll Elect & Informat Engn, Shanghai 200070, Peoples R China

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2023 / Vol. 8 / No. 11

Keywords:

Deep learning; path planning; reinforcement learning; vision-based navigation;

DOI:

10.1109/LRA.2023.3320014

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Subject classification code:

080202 ; 1405 ;

Abstract:

The task of zero-shot object goal visual navigation (ZSON) aims to enable robots to locate previously "unseen" objects from visual observations. This task presents a significant challenge because the robot must transfer the navigation policy learned from "seen" objects to "unseen" objects through auxiliary semantic information without training samples, a process known as zero-shot learning. To address this challenge, we propose a novel approach termed the Semantic Policy Network (SPNet). The SPNet consists of two modules that are deeply integrated with semantic embeddings: the Semantic Actor Policy (SAP) module and the Semantic Trajectory (ST) module. The SAP module generates actor network weight bias based on semantic embeddings, creating unique navigation policies for different target classes. The ST module records the robot's actions, visual features, and semantic embeddings at each step, and aggregates information in both the spatial and temporal dimensions. To evaluate our approach, we conducted extensive experiments using the MP3D dataset, the HM3D dataset, and RoboTHOR. Experimental results indicate that the proposed method outperforms other ZSON methods for both seen and unseen target classes.
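The abstract's description of the SAP module (actor weights adjusted by a bias generated from the target's semantic embedding) can be illustrated with a small NumPy sketch. This is only an illustration under stated assumptions, not the authors' implementation: the linear generator, the single-layer actor, the dimensions, and the names `make_semantic_actor`, `base_w`, and `gen` are all hypothetical.

```python
import numpy as np

def make_semantic_actor(embed_dim, feat_dim, num_actions, seed=0):
    """Build a toy actor whose weights are shifted by a semantic embedding.

    Hypothetical sketch: a shared base weight matrix plus a per-target
    offset produced by a linear generator. Randomly initialised, untrained.
    """
    rng = np.random.default_rng(seed)
    # Shared actor weights used for every target class (assumption).
    base_w = rng.normal(scale=0.1, size=(num_actions, feat_dim))
    # Linear generator mapping an embedding to a flattened weight offset.
    gen = rng.normal(scale=0.1, size=(num_actions * feat_dim, embed_dim))

    def policy(features, target_embedding):
        # Target-specific weight offset derived from the semantic embedding.
        delta = (gen @ target_embedding).reshape(num_actions, feat_dim)
        logits = (base_w + delta) @ features
        # Numerically stable softmax over actions.
        logits = logits - logits.max()
        probs = np.exp(logits)
        return probs / probs.sum()

    return policy

policy = make_semantic_actor(embed_dim=4, feat_dim=8, num_actions=3)
features = np.ones(8)
probs_a = policy(features, np.array([1.0, 0.0, 0.0, 0.0]))
probs_b = policy(features, np.array([0.0, 1.0, 0.0, 0.0]))
```

With the same visual features, the two target embeddings yield different action distributions, which is the sense in which each target class gets its own policy in this toy setting.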

Pages: 7655-7662

Page count: 8
