Deep Reinforcement Learning Visual Target Navigation Method Based on Attention Mechanism and Reward Shaping

Cited by: 1

Authors:

Meng, Yiyue [1]

Guo, Chi [2]

Liu, Jingnan [1]

Affiliations:

[1] GNSS Research Center, Wuhan University, Wuhan

[2] Hubei Luojia Laboratory, Wuhan

Source:

Wuhan Daxue Xuebao (Xinxi Kexue Ban)/Geomatics and Information Science of Wuhan University | 2024, Vol. 49, No. 07

Keywords:

attention mechanism; deep reinforcement learning; reward shaping; visual navigation; visual target navigation

DOI:

10.13203/j.whugis20230193

Chinese Library Classification number:

Subject classification number:

Abstract:

Objectives: Visual target navigation, an important visual navigation task, requires an agent to explore an environment, navigate to a target, and issue the done action while relying only on visual image information and target information. Existing methods usually adopt a deep reinforcement learning framework to solve this problem, but they have two shortcomings: (1) They ignore the relationship between the states at the current and previous time steps, which results in poor navigation performance. (2) Their reward settings are fixed and sparse, and agents cannot learn good navigation strategies under sparse rewards. To solve these problems, we propose a deep reinforcement learning visual target navigation method based on an attention mechanism and reward shaping, which further improves performance on visual target navigation tasks. Methods: First, the method obtains the path area that the agent focused on at the previous time step by applying scaled dot-product attention between the previous visual image and action. Then, it obtains the path area that the agent focuses on at the current time step by applying scaled dot-product attention between the current visual image and the previously focused path area, which introduces the state relationship. We also use scaled dot-product attention to obtain the currently focused target area. We concatenate the currently focused path area and target area to build a better agent state. Additionally, we propose a reward shaping rule to address the sparse reward problem, applying the cosine similarity between the visual image and the target to automatically build a reward space with target preference. Finally, the attention method and the reward shaping method are combined to form the proposed deep reinforcement learning visual target navigation method based on attention mechanism and reward shaping.
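
The Methods steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which inputs act as query, key, and value, how features are extracted, or how the cosine similarity enters the reward. The region-feature layout, the query/key/value assignment, and the names `build_state`, `shaped_reward`, `step_penalty`, and `scale` are all assumptions made for illustration.

```python
import numpy as np

def scaled_dot_product_attention(query, key, value):
    """Standard attention: softmax(Q K^T / sqrt(d_k)) V for 2-D arrays."""
    d_k = query.shape[-1]
    scores = query @ key.T / np.sqrt(d_k)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ value, weights

def build_state(prev_regions, prev_action, curr_regions, target):
    """Hypothetical state builder following the three attention steps.

    Assumption: images are given as (num_regions, d) feature arrays, and the
    action and target are (1, d) embeddings used as queries.
    """
    # Step 1: path area focused at the previous step (previous image + action).
    prev_path, _ = scaled_dot_product_attention(prev_action, prev_regions, prev_regions)
    # Step 2: path area focused now (current image + previous focused area).
    curr_path, _ = scaled_dot_product_attention(prev_path, curr_regions, curr_regions)
    # Step 3: target area focused now (current image + target).
    curr_target, _ = scaled_dot_product_attention(target, curr_regions, curr_regions)
    # Concatenate the focused path and target areas into one state vector.
    return np.concatenate([curr_path, curr_target], axis=-1).ravel()

def cosine_similarity(a, b, eps=1e-8):
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def shaped_reward(image_feat, target_feat, step_penalty=-0.01, scale=0.1):
    """Hypothetical dense reward: a step penalty plus a similarity bonus."""
    return step_penalty + scale * cosine_similarity(image_feat, target_feat)

# Random stand-in features; a real agent would use learned embeddings.
rng = np.random.default_rng(0)
d, num_regions = 8, 49
prev_regions = rng.normal(size=(num_regions, d))
curr_regions = rng.normal(size=(num_regions, d))
prev_action = rng.normal(size=(1, d))
target = rng.normal(size=(1, d))

state = build_state(prev_regions, prev_action, curr_regions, target)
reward = shaped_reward(curr_regions.mean(axis=0), target)
```

In this sketch the state has length 2d because the focused path area and the focused target area are each d-dimensional. Because cosine similarity lies in [-1, 1], the shaped reward stays bounded and rises as the view becomes more similar to the target.
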
Results: We conduct experiments on the AI2-THOR dataset and use success rate (SR) and success weighted by path length (SPL) to evaluate the performance of visual target navigation methods. The results indicate that our method achieves a 7% improvement in SR and a 20% improvement in SPL, which means that the agent learns a better navigation strategy. In addition, the ablation study shows that introducing the state relationship and introducing reward shaping each improve navigation performance. Conclusions: The proposed deep reinforcement learning visual target navigation method based on attention mechanism and reward shaping further improves the navigation success rate and efficiency by building better states and a better reward space. © 2024 Editorial Department of Geomatics and Information Science of Wuhan University. All rights reserved.
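
For reference, the two metrics named in the Results can be computed as in the sketch below. It uses the commonly used definitions: SR is the fraction of successful episodes, and SPL averages, over all episodes, the success indicator times the shortest path length divided by the larger of the agent's path length and the shortest path length. The episode values are made up for illustration and are not results from the paper.

```python
def success_rate(successes):
    """Fraction of episodes that ended in success (entries are 0 or 1)."""
    return sum(successes) / len(successes)

def spl(successes, shortest_lengths, agent_lengths):
    """Success weighted by path length.

    Assumes every shortest path length is positive, so the denominator
    max(agent_length, shortest_length) is never zero.
    """
    total = 0.0
    for s, shortest, taken in zip(successes, shortest_lengths, agent_lengths):
        total += s * shortest / max(taken, shortest)
    return total / len(successes)

# Four hypothetical episodes: success flag, shortest path, path actually taken.
episode_success = [1, 0, 1, 1]
episode_shortest = [4.0, 6.0, 5.0, 2.0]
episode_taken = [4.0, 9.0, 10.0, 2.0]

sr_value = success_rate(episode_success)                           # 0.75
spl_value = spl(episode_success, episode_shortest, episode_taken)  # 0.625
```

The third episode succeeds but takes a path twice as long as the shortest one, so it contributes only 0.5 to SPL. This is why SPL reflects navigation efficiency as well as success.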


Pages: 1100-1108 and 1119
