Learning to Navigate in Human Environments via Deep Reinforcement Learning

Cited by: 1

Authors:

Gao, Xingyuan ^{[1,2]}

Sun, Shiying ^{[1]}

Zhao, Xiaoguang ^{[1]}

Tan, Min ^{[1]}

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

Source:

NEURAL INFORMATION PROCESSING (ICONIP 2019), PT I | 2019 / Vol. 11953

Funding:

National Natural Science Foundation of China;

Keywords:

Socially normative navigation; Reinforcement learning;

DOI:

10.1007/978-3-030-36708-4_34

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Mobile robots have been widely applied in human-populated environments. To interact with humans, the robots require the capacity to navigate safely and efficiently in complex environments. Recent works have successfully applied reinforcement learning to learn socially normative navigation behaviors. However, they mostly focus on modeling human-robot cooperation and neglect complex interactions between pedestrians. In addition, these methods assume perfect sensing of pedestrian states, which makes the models less robust to perception uncertainty. This work presents a novel algorithm to learn an efficient navigation policy that exhibits socially normative navigation behaviors. We propose to employ convolutional social pooling to jointly capture human-robot cooperation and inter-human interactions in an actor-critic reinforcement learning framework. In addition, we propose to focus on partial observability in socially normative navigation. Our model is capable of learning the representation of unobservable states with recurrent neural networks, which further improves the stability of the algorithm. Experimental results show that the proposed learning algorithm enables robots to learn socially normative navigation behaviors and achieves better performance than state-of-the-art methods.

Citation

Pages: 418-429

Page count: 12

References (22 in total)

[1] Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265

[2] [Anonymous], 2012, ROBOTICS SCI SYSTEMS

[3] Aoude, Georges S.; Luders, Brandon D.; Joseph, Joshua M.; Roy, Nicholas; How, Jonathan P. Probabilistically safe motion planning to avoid dynamic obstacles with uncertain motion patterns [J]. AUTONOMOUS ROBOTS, 2013, 35 (01): 51-76

[4] Chen CG, 2019, IEEE INT CONF ROBOT, P6015, DOI 10.1109/ICRA.2019.8794134

[5] Chen YF, 2017, IEEE INT C INT ROBOT, P1343, DOI 10.1109/IROS.2017.8202312

[6] Cho Kyunghyun, 2014, C EMPIRICAL METHODS, P1724

[7] Deo, Nachiket; Trivedi, Mohan M. Convolutional Social Pooling for Vehicle Trajectory Prediction [J]. PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2018: 1549-1557

[8] Everett M, 2018, IEEE INT C INT ROBOT, P3052, DOI 10.1109/IROS.2018.8593871

[9] Heess N., 2015, arXiv:1512.04455 [cs]

[10] Heess Nicolas, 2017, CoRR abs/1707.02286
