Collaborative promotion: Achieving safety and task performance by integrating imitation reinforcement learning

Cited by: 0
Authors
Zhang, Cai [1]
Zhang, Xiaoxiong [2, 3]
Zhang, Hui [2, 3]
Zhu, Fei [1]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[2] Natl Univ Def Technol, Sixty Res Inst 3, Nanjing 210007, Peoples R China
[3] Natl Univ Def Technol, Lab big data & decis, Changsha 410073, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Safe reinforcement learning; Imitation learning; Dual policy networks; Multi-objective optimization; Loose coupling;
DOI
10.1016/j.eswa.2024.124820
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Although the importance of safety for artificial intelligence is self-evident, focusing excessively on safety without considering task performance can make an agent conservative and hesitant. Striking a balance between safety and task performance has therefore become a pressing concern. To address this issue, we introduce Collaborative Promotion (CP), which is designed to harmonize safety and task objectives and thereby enable a loosely coupled optimization of the two. CP is a novel dual-policy framework in which the safety and task objectives are assigned to a safety policy framework and a task policy framework, respectively, as their primary goals. Each framework is built as an actor-critic, with the value function guiding improvement of its primary objective. The secondary objective is optimized through imitation learning: each framework treats the other as an expert in its domain and learns from it by behavioral cloning. The safety policy framework employs a weighted-sum method for multi-objective optimization, establishing a primary-secondary relationship that facilitates loosely coupled optimization of the safety and task objectives. We benchmark CP against task-specific and safety-specific algorithms on Safe Navigation and Safe Velocity. Extensive experiments demonstrate that CP achieves the intended goals.
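The weighted-sum idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `behavioral_cloning_loss` and `weighted_sum_loss`, the mean-squared-error form of the cloning term, and the convex weighting by `weight` are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Sequence


def behavioral_cloning_loss(own_actions: Sequence[float],
                            expert_actions: Sequence[float]) -> float:
    """Mean squared error between a policy's actions and the other
    policy's actions, used here as a stand-in behavioral cloning term
    (assumed form, not taken from the paper)."""
    if len(own_actions) != len(expert_actions) or not own_actions:
        raise ValueError("action sequences must be non-empty and equal in length")
    squared = [(a - e) ** 2 for a, e in zip(own_actions, expert_actions)]
    return sum(squared) / len(squared)


def weighted_sum_loss(primary_loss: float,
                      secondary_loss: float,
                      weight: float) -> float:
    """Combine a primary objective with a secondary one.

    `weight` in [0, 1] is a hypothetical knob: values above 0.5 keep the
    primary objective dominant, giving a primary-secondary relationship.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * primary_loss + (1.0 - weight) * secondary_loss


# Example: the safety policy treats the task policy as an expert.
safety_actor_loss = 2.0                        # placeholder primary (safety) loss
bc_term = behavioral_cloning_loss([0.0, 1.0],  # safety policy actions
                                  [0.0, 0.0])  # task policy ("expert") actions
total = weighted_sum_loss(safety_actor_loss, bc_term, weight=0.75)
```

Under these assumptions, a larger `weight` makes the safety policy follow its own critic more closely, while a smaller one pulls it toward the task policy's behavior.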
Pages: 12