A Dual Decision-Making Continuous Reinforcement Learning Method Based on Sim2Real

Cited: 0
Authors
Xiao, Wenwen [1 ]
Wang, Xinzhi [1 ]
Luo, Xiangfeng [1 ]
Xie, Shaorong [1 ]
Affiliations
[1] Shanghai University, School of Computer Engineering and Science, Shanghai 200444, People's Republic of China
Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai
Keywords
Social computing; continuous learning; reinforcement learning; simulation to reality (Sim2Real)
DOI
10.1142/S0218194023500626
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Continuous reinforcement learning carries potential safety risks when applied in real-world scenarios, which could have significant societal implications. Although its range of applications is expanding, most deployments remain confined to virtual environments. If an unmanned system relies on a single continuous learning method, it will still forget previously learned experience and will require retraining whenever it encounters an unknown environment, which reduces its learning efficiency. To address these issues, some researchers have proposed prioritizing the experience replay pool and using transfer learning to apply previously learned policies to new environments. However, these methods only slow the rate at which the unmanned system forgets its experience; they do not solve the problem fundamentally, nor can they prevent dangerous actions or escape local optima. We therefore propose a dual decision-making continuous learning method based on simulation to reality (Sim2Real). The method employs a knowledge body to escape the local-optimum dilemma and corrects poor policies in a timely manner, ensuring that the unmanned system makes the best decision each time. Our experimental results demonstrate that our method achieves a 30% higher success rate than other state-of-the-art methods, and that the model remains highly effective when transferred to real scenes.
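The prioritized experience replay pool mentioned above as prior art can be sketched as follows. This is a minimal, illustrative implementation of proportional prioritization (transitions with larger TD error are sampled more often); the class name, parameters, and priority exponent are assumptions for illustration, not the authors' actual implementation.

```python
import random


class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (illustrative sketch)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity      # maximum number of stored transitions
        self.alpha = alpha            # how strongly TD error skews sampling
        self.buffer = []              # stored transitions, oldest first
        self.priorities = []          # one priority per transition

    def add(self, transition, td_error=1.0):
        # Priority grows with TD error; epsilon keeps it strictly positive.
        priority = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.buffer) >= self.capacity:
            # Evict the oldest transition when full.
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Sample transitions with probability proportional to priority.
        total = sum(self.priorities)
        weights = [p / total for p in self.priorities]
        k = min(batch_size, len(self.buffer))
        return random.choices(self.buffer, weights=weights, k=k)
```

As the abstract notes, such a buffer only slows forgetting: high-error transitions are revisited more often, but experience evicted from the buffer is still lost, which is the gap the proposed dual decision-making method targets.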
Pages: 467-488 (22 pages)