Reinforcement learning and aggregation

Cited by: 0
Authors
Jiang, J [1]
Kamel, M [1]
Chen, L [1]
Affiliations
[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Source
2004 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN & CYBERNETICS, VOLS 1-7 | 2004
Keywords
reinforcement learning; multiagent systems; aggregation;
DOI
Not available
CLC number (Chinese Library Classification)
TP [automation technology, computer technology]
Subject classification code
0812
Abstract
Reinforcement learning (RL) is a learning technique that provides a means of learning an optimal control policy when the dynamics of the environment under consideration are unavailable [7, 13]. While RL has been successfully applied in many single- and multi-agent systems [1, 3, 14, 10], the learning quality is strongly influenced by the learning algorithm and its parameters. Setting the parameters of RL algorithms is something of a black art, and small differences in these parameters can lead to large differences in learning quality. Determining the best algorithm and the optimal parameters can be costly in terms of time and computation. Even if the cost is acceptable, the robustness of the learning remains in question. To address this difficulty, an Aggregated Multiagent Reinforcement Learning System (AMRLS) is proposed that treats the RL environment as a multiagent environment. A maze-world environment is used to validate the AMRLS. Experimental results illustrate that, compared with standard Q(lambda)-learning and SARSA(lambda) algorithms, the AMRLS increases both the learning speed and the rate of reaching the shortest path.
Pages: 1303-1308
Page count: 6
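
The record gives only the high-level idea of AMRLS (treat several RL learners as agents in one environment and aggregate their behaviour), not its actual aggregation rule, so the following Python sketch is purely illustrative. It trains a SARSA(lambda) learner and a Watkins-style Q(lambda) learner on an assumed 4x4 maze and combines their greedy action recommendations with a simple majority vote; the maze, all parameter values, and the voting rule itself are assumptions, not the authors' method.

```python
# Illustrative sketch only (assumed design, not the authors' AMRLS): two tabular
# learners with eligibility traces -- SARSA(lambda) and Watkins-style Q(lambda) --
# are trained on a toy 4x4 maze, and their greedy recommendations are combined
# by a simple majority vote at each step.
import random
from collections import defaultdict

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
SIZE, START, GOAL = 4, (0, 0), (3, 3)

def step(state, action):
    """Move in the grid, clipping at the walls; small step cost, +1 at the goal."""
    r = min(max(state[0] + ACTIONS[action][0], 0), SIZE - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else -0.01), nxt == GOAL

class TraceLearner:
    """Tabular learner with eligibility traces: on_policy=True gives SARSA(lambda),
    on_policy=False gives Watkins-style Q(lambda)."""
    def __init__(self, on_policy, alpha=0.1, gamma=0.95, lam=0.8):
        self.on_policy, self.alpha, self.gamma, self.lam = on_policy, alpha, gamma, lam
        self.Q = defaultdict(float)   # (state, action) -> value estimate
        self.e = defaultdict(float)   # (state, action) -> eligibility trace

    def greedy(self, s):
        return max(range(len(ACTIONS)), key=lambda a: self.Q[(s, a)])

    def update(self, s, a, r, s2, a2, done):
        backup_a = a2 if self.on_policy else self.greedy(s2)
        target = r + (0.0 if done else self.gamma * self.Q[(s2, backup_a)])
        delta = target - self.Q[(s, a)]
        self.e[(s, a)] += 1.0
        # Watkins' Q(lambda) cuts traces when the next executed action is exploratory.
        cut = done or ((not self.on_policy) and a2 != self.greedy(s2))
        for key in list(self.e):
            self.Q[key] += self.alpha * delta * self.e[key]
            self.e[key] = 0.0 if cut else self.e[key] * self.gamma * self.lam

def aggregated_action(learners, s, eps=0.1):
    """Naive aggregation rule (an assumption): epsilon-greedy over a majority
    vote of the learners' current greedy actions."""
    if random.random() < eps:
        return random.randrange(len(ACTIONS))
    votes = [lrn.greedy(s) for lrn in learners]
    return max(set(votes), key=votes.count)

learners = [TraceLearner(on_policy=True), TraceLearner(on_policy=False)]
for episode in range(300):
    s, a = START, aggregated_action(learners, START)
    for _ in range(100):                      # cap episode length
        s2, r, done = step(s, a)
        a2 = aggregated_action(learners, s2)
        for lrn in learners:                  # every learner sees the shared experience
            lrn.update(s, a, r, s2, a2, done)
        s, a = s2, a2
        if done:
            break

print("aggregated greedy action at START:", aggregated_action(learners, START, eps=0.0))
```

Both learners update from the same shared experience stream, which mirrors the abstract's framing of one environment observed by several learning agents; the paper's actual aggregation mechanism may well differ from the majority vote assumed here.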