A Dual Decision-Making Continuous Reinforcement Learning Method Based on Sim2Real

Cited: 0
Authors
Xiao, Wenwen [1 ]
Wang, Xinzhi [1 ]
Luo, Xiangfeng [1 ]
Xie, Shaorong [1 ]
Affiliations
[1] Shanghai University, School of Computer Engineering and Science, Shanghai 200444, People's Republic of China
Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai
Keywords
Social computing; continuous learning; reinforcement learning; simulation to reality (Sim2Real)
DOI
10.1142/S0218194023500626
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Continuous reinforcement learning carries potential safety risks when applied in real-world scenarios, which could have significant societal implications. Although its range of applications is expanding, most deployments remain confined to virtual environments. If an unmanned system relies on a single continuous learning method, it will still forget previously learned experience and will require retraining whenever it encounters an unknown environment, which reduces its learning efficiency. To address these issues, some researchers have proposed prioritizing the experience replay pool and using transfer learning to apply previously learned policies to new environments. However, these methods only slow the rate at which the unmanned system forgets its experience; they do not solve the problem fundamentally, nor can they prevent dangerous actions or escape local optima. We therefore propose a dual decision-making continuous learning method based on simulation to reality (Sim2Real). The method employs a knowledge body to escape the local-optimum dilemma and corrects poor policies in a timely manner, ensuring that the unmanned system makes the best decision each time. Our experimental results demonstrate that our method achieves a 30% higher success rate than other state-of-the-art methods, and that the model remains highly effective when transferred to real scenes.
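The prioritized experience replay pool mentioned above as prior art can be sketched as follows. This is a minimal, illustrative implementation of proportional prioritization (transitions with larger TD error are sampled more often); the class name, parameters, and priority exponent are assumptions for illustration, not the authors' actual implementation.

```python
import random


class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (illustrative sketch)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity      # maximum number of stored transitions
        self.alpha = alpha            # how strongly TD error skews sampling
        self.buffer = []              # stored transitions, oldest first
        self.priorities = []          # one priority per transition

    def add(self, transition, td_error=1.0):
        # Priority grows with TD error; epsilon keeps it strictly positive.
        priority = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.buffer) >= self.capacity:
            # Evict the oldest transition when full.
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Sample transitions with probability proportional to priority.
        total = sum(self.priorities)
        weights = [p / total for p in self.priorities]
        k = min(batch_size, len(self.buffer))
        return random.choices(self.buffer, weights=weights, k=k)
```

As the abstract notes, such a buffer only slows forgetting: high-error transitions are revisited more often, but experience evicted from the buffer is still lost, which is the gap the proposed dual decision-making method targets.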
Pages: 467-488 (22 pages)