Adversarial Reinforcement Learning for Enhanced Decision-Making of Evacuation Guidance Robots in Intelligent Fire Scenarios

Cited by: 0

Authors:

Zhao, Hantao [1,2]

Liang, Zhihao [1]

Ma, Tianxing [3]

Shi, Xiaomeng [4]

Kapadia, Mubbasir [5]

Thrash, Tyler [6]

Hoelscher, Christoph [7]

Jia, Jinyuan [8,9]

Liu, Bo [2,10]

Cao, Jiuxin [1,2]

Affiliations:

[1] Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211189, Peoples R China

[2] Purple Mt Labs, Nanjing 211111, Peoples R China

[3] Chinese Acad Sci, Inst Informat Engn, Beijing 100085, Peoples R China

[4] Southeast Univ, Sch Transportat, Nanjing 211189, Peoples R China

[5] Rutgers State Univ, Dept Comp Sci, Newark, NJ 07102 USA

[6] St Louis Univ, Dept Biol, St Louis, MO 63103 USA

[7] Swiss Fed Inst Technol, Chair Cognit Sci, CH-8092 Zurich, Switzerland

[8] Hong Kong Univ Sci & Technol, Guangzhou 511453, Peoples R China

[9] Jilin Animat Inst, Game Sch, Changchun 130012, Peoples R China

[10] Southeast Univ, Sch Comp Sci & Engn, Nanjing 211189, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS | 2024

Funding:

National Natural Science Foundation of China

Keywords:

Robots; Training; Adaptation models; Decision making; Behavioral sciences; Heuristic algorithms; Reinforcement learning; Robustness; Computational modeling; Microscopy; Adversarial reinforcement learning (ARL); human-robot interaction; multiagent reinforcement learning (MARL); simulation frameworks; NAVIGATION;

DOI:

10.1109/TCSS.2024.3502420

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

In the context of rapid urbanization, traditional manual guidance and static evacuation signs are increasingly inadequate for addressing complex and dynamic emergencies. This study proposes an emergency evacuation framework that optimizes crowd evacuation by integrating multiagent reinforcement learning (MARL) with adversarial reinforcement learning (ARL). The developed simulation environment models realistic human behavior in complex buildings and incorporates robotic navigation and intelligent path planning. It integrates a novel simulated human behavior model that supports complex human-robot interaction and independent escape route searching, and that exhibits herd mentality and memory mechanisms. We also propose a multiagent framework that combines MARL and ARL to enhance overall evacuation efficiency and robustness. Additionally, we developed a new ARL evaluation framework that provides a novel method for quantifying agents' performance. Experiments at several difficulty levels were conducted, and the results demonstrate that the proposed framework offers advantages in emergency evacuation scenarios. Specifically, our ARLR approach increased survival rates by 1.8 percentage points in low-difficulty evacuation tasks compared to the RLR approach, which uses only MARL algorithms. In high-difficulty evacuation tasks, the ARLR approach raised survival rates from 46.7% without robots to 64.4%, exceeding the RLR approach by 1.7 percentage points. This study aims to enhance the efficiency and safety of human-robot collaborative fire evacuations and provides theoretical support for evaluating and improving the performance and robustness of ARL agents.
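The high-difficulty figures in the abstract mix absolute survival rates with gaps between approaches, so a short worked calculation may help. The sketch below uses only the three numbers quoted above; the RLR survival rate it derives is implied by those numbers and is not stated in the abstract, and all function and constant names are hypothetical.

```python
# Figures quoted in the abstract for the high-difficulty task (in percent).
NO_ROBOT_SURVIVAL = 46.7   # survival rate without robots
ARLR_SURVIVAL = 64.4       # survival rate with the ARLR approach
ARLR_OVER_RLR_GAP = 1.7    # ARLR advantage over RLR, in percentage points


def implied_baseline(rate: float, gap_points: float) -> float:
    """Rate of the weaker approach implied by a gap given in points."""
    return round(rate - gap_points, 1)


def point_gain(new: float, base: float) -> float:
    """Absolute gain: the difference between two percentages."""
    return round(new - base, 1)


def relative_gain_percent(new: float, base: float) -> float:
    """Relative gain: the increase as a percentage of the base rate."""
    return round((new - base) / base * 100, 1)


# Derived, not reported in the abstract: the implied RLR survival rate.
rlr_survival = implied_baseline(ARLR_SURVIVAL, ARLR_OVER_RLR_GAP)

print("Implied RLR survival rate:", rlr_survival)
print("ARLR gain over no robots (points):",
      point_gain(ARLR_SURVIVAL, NO_ROBOT_SURVIVAL))
print("ARLR gain over no robots (relative %):",
      relative_gain_percent(ARLR_SURVIVAL, NO_ROBOT_SURVIVAL))
```

The two gain functions differ on purpose: a gain in points is a plain difference between two rates, while a relative gain scales that difference by the starting rate, so the same improvement yields a larger number when expressed relatively.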

Pages: 17

References: 61 in total
