Additional planning with multiple objectives for reinforcement learning

Cited by: 11
Authors
Pan, Anqi [1, 2]
Xu, Wenjun [3, 4]
Wang, Lei [5 ]
Ren, Hongliang [3 ]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai 201620, Peoples R China
[2] Donghua Univ, Engn Res Ctr Digitized Text & Fash Technol, Minist Educ, Shanghai 201620, Peoples R China
[3] Natl Univ Singapore, Dept Biomed Engn, Singapore 117583, Singapore
[4] Peng Cheng Lab, Shenzhen 518055, Peoples R China
[5] Tongji Univ, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
Keywords
Reinforcement learning; Multi-objective; Robotic control; ALGORITHM;
DOI
10.1016/j.knosys.2019.105392
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Most control tasks have multiple objectives that must be achieved simultaneously, yet the reward is usually defined as a weighted combination of all objectives so that a single optimal policy can be determined. This configuration limits exploration flexibility and makes it difficult to reach a satisfactory termination condition. Although several multi-objective reinforcement learning (MORL) methods have been presented recently, they concentrate on obtaining a set of compromise options rather than one best-performing strategy. On the other hand, existing policy-improvement methods have rarely addressed settings with multiple objectives. Inspired by enhanced policy search methods, this paper proposes an additional planning technique with multiple objectives for reinforcement learning, denoted RLAP-MOP. The method evaluates parallel requirements at the same time and suggests several optimal feasible actions to further improve long-term performance. Meanwhile, the short-term planning adopted in this paper helps maintain safe trajectories and build more accurate approximate models, which accelerates training. For comparison, an RLAP variant with single-objective optimization is also introduced in the theoretical and experimental studies. The proposed techniques are investigated on a multi-objective cartpole environment and a soft robotic palpation task. The improvements in return values and learning stability indicate that additional planning based on multiple objectives is a promising aid for reinforcement learning. (c) 2019 Elsevier B.V. All rights reserved.
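The contrast the abstract draws between a weighted-sum reward and a multi-objective view can be illustrated with a small Python sketch. This is not the RLAP-MOP algorithm from the paper; it only shows, under assumed toy numbers, how scalarizing with fixed weights selects one candidate action while a Pareto (non-dominated) filter keeps several feasible ones. The function names, the candidate values, and the weights are all hypothetical.

```python
from typing import List, Sequence


def weighted_sum(objectives: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse several objective values into one scalar reward."""
    return sum(w * o for w, o in zip(weights, objectives))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(candidates: List[Sequence[float]]) -> List[int]:
    """Indices of candidates that no other candidate dominates."""
    return [
        i
        for i, a in enumerate(candidates)
        if not any(dominates(b, a) for j, b in enumerate(candidates) if j != i)
    ]


# Hypothetical per-action objective values (higher is better for both objectives).
candidates = [(1.0, 0.1), (0.6, 0.6), (0.1, 1.0), (0.5, 0.5)]
weights = (0.5, 0.5)  # assumed fixed weights for the scalarized reward

# Scalarization commits to a single action for the chosen weights.
best_scalar = max(range(len(candidates)), key=lambda i: weighted_sum(candidates[i], weights))

# The multi-objective view keeps every action that is not dominated.
pareto_indices = non_dominated(candidates)
```

With these toy values the scalarized choice is a single index, whereas the non-dominated filter retains three of the four candidates, which is the kind of flexibility the abstract attributes to evaluating objectives in parallel.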
Pages: 10