Collaborative promotion: Achieving safety and task performance by integrating imitation reinforcement learning

Cited by: 0
Authors
Zhang, Cai [1 ]
Zhang, Xiaoxiong [2 ,3 ]
Zhang, Hui [2 ,3 ]
Zhu, Fei [1 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[2] Natl Univ Def Technol, Sixty Res Inst 3, Nanjing 210007, Peoples R China
[3] Natl Univ Def Technol, Lab big data & decis, Changsha 410073, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Safe reinforcement learning; Imitation learning; Dual policy networks; Multi-objective optimization; Loose coupling;
DOI
10.1016/j.eswa.2024.124820
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Although the importance of safety is self-evident for artificial intelligence, like the two sides of a coin, focusing excessively on safety performance without considering task performance may cause the agent to become conservative and thus hesitant. How to strike a balance between safety and task performance has become a pressing concern. To address this issue, we introduce Collaborative Promotion (CP), a framework designed to harmonize safety and task objectives, thereby enabling loosely coupled optimization of the dual objectives. CP is a novel dual-policy framework in which the safety and task objectives are assigned to the safety policy framework and the task policy framework, respectively, as their primary goals. An actor-critic framework is constructed, using the value function to guide the improvement of these primary objectives. With the aid of imitation learning, secondary-objective optimization is achieved through behavioral cloning, with each framework treating the other as an expert in its own domain. The safety policy framework employs a weighted-sum method for multi-objective optimization, establishing a primary-secondary relationship that facilitates loosely coupled optimization of safety and task objectives. On the Safe Navigation and Safe Velocity benchmarks, we compare CP against task-specific and safety-specific algorithms. Extensive experiments demonstrate that CP achieves the intended goals.
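The abstract's weighted-sum combination of a primary objective with a behavioral-cloning (imitation) term can be sketched as below. This is a minimal illustration of the general technique, not the authors' implementation; the function name, the mean-squared-error form of the cloning loss, and the `bc_weight` coefficient are all assumptions for exposition.

```python
import numpy as np

def weighted_dual_objective(primary_loss, policy_actions, expert_actions, bc_weight=0.5):
    """Weighted sum of a primary objective and a behavioral-cloning term.

    The cloning term pulls this policy's actions toward those of the other
    policy (treated as the 'expert' for the secondary objective), giving the
    primary-secondary, loosely coupled structure described in the abstract.
    All names here are illustrative, not the paper's API.
    """
    policy_actions = np.asarray(policy_actions, dtype=float)
    expert_actions = np.asarray(expert_actions, dtype=float)
    # Secondary objective: mean-squared deviation from the expert's actions.
    bc_loss = np.mean((policy_actions - expert_actions) ** 2)
    # Weighted sum establishes the primary-secondary relationship.
    return primary_loss + bc_weight * bc_loss
```

When the two policies already agree, the combined loss reduces to the primary loss alone; the weight `bc_weight` controls how strongly the secondary (imitation) objective is allowed to influence the update.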
Pages: 12
Related Papers
50 records total
  • [41] Integrating Implementation Science in a Quality and Patient Safety Improvement Learning Collaborative: Essential Ingredients and Impact
    Jeffs, Lianne
    Bruno, Frances
    Zeng, Rui Lin
    Schonewille, Noah
    Kinder, Kim
    De Souza, Gina
    D'Arpino, Maryanne
    Baker, G. Ross
    JOINT COMMISSION JOURNAL ON QUALITY AND PATIENT SAFETY, 2023, 49 (05): 255 - 264
  • [42] Using Goal-Conditioned Reinforcement Learning With Deep Imitation to Control Robot Arm in Flexible Flat Cable Assembly Task
    Li, Jingchen
    Shi, Haobin
    Hwang, Kao-Shing
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, 21 (04) : 6217 - 6228
  • [43] Performance Analysis of Reinforcement Learning for Achieving Context Awareness and Intelligence in Mobile Cognitive Radio Networks
    Yau, Kok-Lim Alvin
    Komisarczuk, Peter
    Teal, Paul D.
    25TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA 2011), 2011, : 1 - 8
  • [44] Performance Analysis of Reinforcement Learning for Achieving Context-Awareness and Intelligence in Cognitive Radio Networks
    Yau, Kok-Lim Alvin
    Komisarczuk, Peter
    Teal, Paul D.
    2009 IEEE 34TH CONFERENCE ON LOCAL COMPUTER NETWORKS (LCN 2009), 2009, : 1046 - 1053
  • [45] Autonomic responses to choice outcomes: Links to task performance and reinforcement-learning parameters
    Hayes, William M.
    Wedell, Douglas H.
    BIOLOGICAL PSYCHOLOGY, 2020, 156
  • [46] Deep Reinforcement Learning Task Scheduling Method for Real-Time Performance Awareness
    Wang, Jinming
    Li, Shaobo
    Zhang, Xingxing
    Zhu, Keyu
    Xie, Cankun
    Wu, Fengbin
    IEEE ACCESS, 2025, 13 : 31385 - 31400
  • [47] A Time-saving Task Scheduling Algorithm Based on Deep Reinforcement Learning for Edge Cloud Collaborative Computing
    Zou, Wenhao
    Zhang, Zongshuai
    Wang, Nina
    Tan, Xiaochen
    Tian, Lin
    2024 IEEE 99TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2024-SPRING, 2024,
  • [48] A collaborative computation and dependency-aware task offloading method for vehicular edge computing: a reinforcement learning approach
    Liu, Guozhi
    Dai, Fei
    Huang, Bi
    Qiang, Zhenping
    Wang, Shuai
    Li, Lecheng
    JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS, 2022, 11 (01):
  • [49] Collaborative task decision-making of multi-UUV in dynamic environments based on deep reinforcement learning
    Yu, Haomiao
    Zhang, Sijie
    SHIPS AND OFFSHORE STRUCTURES, 2024,
  • [50] Asynchronous Deep Reinforcement Learning for Collaborative Task Computing and On-Demand Resource Allocation in Vehicular Edge Computing
    Liu, L.
    Feng, J.
    Mu, X.
    Pei, Q.
    Lan, D.
    Xiao, M.
    IEEE Transactions on Intelligent Transportation Systems, 2023, 24 (12) : 15513 - 15526