Adversarial Reinforcement Learning for Enhanced Decision-Making of Evacuation Guidance Robots in Intelligent Fire Scenarios

Cited by: 0

Authors:

Zhao, Hantao ^{[1,2]}

Liang, Zhihao ^{[1]}

Ma, Tianxing ^{[3]}

Shi, Xiaomeng ^{[4]}

Kapadia, Mubbasir ^{[5]}

Thrash, Tyler ^{[6]}

Hoelscher, Christoph ^{[7]}

Jia, Jinyuan ^{[8,9]}

Liu, Bo ^{[2,10]}

Cao, Jiuxin ^{[1,2]}

Affiliations:

[1] Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211189, Peoples R China

[2] Purple Mt Labs, Nanjing 211111, Peoples R China

[3] Chinese Acad Sci, Inst Informat Engn, Beijing 100085, Peoples R China

[4] Southeast Univ, Sch Transportat, Nanjing 211189, Peoples R China

[5] Rutgers State Univ, Dept Comp Sci, Newark, NJ 07102 USA

[6] St Louis Univ, Dept Biol, St Louis, MO 63103 USA

[7] Swiss Fed Inst Technol, Chair Cognit Sci, CH-8092 Zurich, Switzerland

[8] Hong Kong Univ Sci & Technol, Guangzhou 511453, Peoples R China

[9] Jilin Animat Inst, Game Sch, Changchun 130012, Peoples R China

[10] Southeast Univ, Sch Comp Sci & Engn, Nanjing 211189, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

Robots; Training; Adaptation models; Decision making; Behavioral sciences; Heuristic algorithms; Reinforcement learning; Robustness; Computational modeling; Microscopy; Adversarial reinforcement learning (ARL); human-robot interaction; multiagent reinforcement learning (MARL); simulation frameworks; NAVIGATION;

DOI:

10.1109/TCSS.2024.3502420

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

In the context of rapid urbanization, traditional manual guidance and static evacuation signs are increasingly inadequate for addressing complex and dynamic emergencies. This study proposes an emergency evacuation framework that optimizes crowd evacuation by integrating multiagent reinforcement learning (MARL) with adversarial reinforcement learning (ARL). The developed simulation environment models realistic human behavior in complex buildings and incorporates robotic navigation and intelligent path planning. It integrates a novel simulated human behavior model that supports complex human-robot interaction and independent escape route searching, and that exhibits herd mentality and memory mechanisms. We also propose a multiagent framework that combines MARL and ARL to enhance overall evacuation efficiency and robustness. Additionally, we develop a new ARL evaluation framework that provides a novel method for quantifying agents' performance. Experiments at several difficulty levels were conducted, and the results demonstrate that the proposed framework exhibits advantages in emergency evacuation scenarios. Specifically, our ARLR approach increased survival rates by 1.8 percentage points in low-difficulty evacuation tasks compared to the RLR approach, which uses only MARL algorithms. In high-difficulty evacuation tasks, the ARLR approach raised survival rates from 46.7% without robots to 64.4%, exceeding the RLR approach by 1.7 percentage points. This study aims to enhance the efficiency and safety of human-robot collaborative fire evacuations and provides theoretical support for evaluating and improving the performance and robustness of ARL agents.

References

Pages: 17

61 entries in total

