Additional planning with multiple objectives for reinforcement learning

Cited by: 11
Authors
Pan, Anqi [1, 2]
Xu, Wenjun [3, 4]
Wang, Lei [5]
Ren, Hongliang [3]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai 201620, Peoples R China
[2] Donghua Univ, Engn Res Ctr Digitized Text & Fash Technol, Minist Educ, Shanghai 201620, Peoples R China
[3] Natl Univ Singapore, Dept Biomed Engn, Singapore 117583, Singapore
[4] Peng Cheng Lab, Shenzhen 518055, Peoples R China
[5] Tongji Univ, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
Keywords
Reinforcement learning; Multi-objective; Robotic control; ALGORITHM;
DOI
10.1016/j.knosys.2019.105392
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most control tasks have multiple objectives that must be achieved simultaneously, yet the reward is usually defined as a weighted combination of all objectives in order to determine a single optimal policy. This configuration limits exploration flexibility and makes it difficult to reach a satisfactory termination condition. Although some multi-objective reinforcement learning (MORL) methods have been presented recently, they concentrate on obtaining a set of compromise options rather than one best-performing strategy. On the other hand, existing policy-improvement methods have rarely addressed multi-objective settings. Inspired by enhanced policy search methods, this paper proposes an additional planning technique with multiple objectives for reinforcement learning, denoted RLAP-MOP. The method evaluates parallel requirements at the same time and suggests several optimal feasible actions to further improve long-term performance. Meanwhile, the short-term planning adopted in this paper helps maintain safe trajectories and build more accurate approximate models, which accelerates training. For comparison, an RLAP with single-objective optimization is also introduced in the theoretical and experimental studies. The proposed techniques are investigated on a multi-objective cartpole environment and a soft robotic palpation task. The improved return values and learning stability show that additional planning based on multiple objectives is a promising aid for improving reinforcement learning. (c) 2019 Elsevier B.V. All rights reserved.
Pages: 10