Fuzzy Q-learning in continuous state and action space

Cited by: 4
Authors
Xu M.-L. [1 ,2 ]
Xu W.-B. [2 ]
Affiliations
[1] Department of Electronic Information Engineering, Wuxi City College of Vocational Technology
[2] School of Information Technology, Jiangnan University
Source
Journal of China Universities of Posts and Telecommunications | 2010, Vol. 17, No. 4
Funding
National Natural Science Foundation of China
Keywords
adaptation; continuous; FIS; Q-learning;
DOI
10.1016/S1005-8885(09)60495-7
Abstract
An adaptive fuzzy Q-learning (AFQL) method based on fuzzy inference systems (FIS) is proposed. The FIS, realized by a normalized radial basis function (NRBF) neural network, is used to approximate the Q-value function, whose input is composed of the state and the action. The rules of the FIS are created incrementally according to the novelty of each state-action pair. Moreover, the premise and consequent parts of the FIS are updated using an extended Kalman filter (EKF). The action applied to the environment is the one that maximizes the FIS output in the current state, and it is generated through an optimization method. Simulation results on the wall-following task of mobile robots and the inverted pendulum balancing problem demonstrate the superiority and applicability of the proposed AFQL method. © 2010 The Journal of China Universities of Posts and Telecommunications.
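The abstract outlines the main ingredients: an NRBF approximator of Q(state, action), novelty-driven incremental rule creation, EKF parameter updates, and action selection by maximizing the FIS output. Below is a minimal Python sketch of that idea under stated simplifications; the class name, the novelty threshold, the shared rule width, the candidate-action grid, and the plain gradient TD step (used here in place of the paper's EKF update and continuous-action optimization) are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class NRBFQSketch:
    """Normalized-RBF Q-function over the joint (state, action) space.

    Illustrative only: rules are added when a state-action pair is novel
    (far from every existing center), and the consequent weights are tuned
    with a plain gradient TD step instead of the EKF used in the paper;
    actions are chosen from a finite candidate set rather than by a
    continuous optimizer.
    """

    def __init__(self, novelty_threshold=0.5, width=0.5, lr=0.1, gamma=0.95):
        self.centers = []              # rule centers in (state, action) space
        self.weights = []              # one consequent weight per rule
        self.tau = novelty_threshold   # distance below which a pair is "known"
        self.width = width             # shared Gaussian width of the rules
        self.lr = lr
        self.gamma = gamma

    def _phi(self, x):
        # Normalized Gaussian activation of every rule at point x.
        c = np.array(self.centers)
        g = np.exp(-np.sum((c - x) ** 2, axis=1) / (2.0 * self.width ** 2))
        return g / (g.sum() + 1e-12)

    def q(self, s, a):
        if not self.centers:
            return 0.0
        x = np.concatenate([s, a])
        return float(np.dot(self._phi(x), self.weights))

    def maybe_add_rule(self, s, a):
        # Incremental rule creation driven by novelty of the state-action pair.
        x = np.concatenate([s, a])
        if not self.centers or min(
            np.linalg.norm(np.asarray(c) - x) for c in self.centers
        ) > self.tau:
            self.centers.append(x)
            self.weights.append(0.0)

    def best_action(self, s, candidates):
        # Greedy action over a finite candidate set (stand-in for the
        # continuous-action optimization described in the abstract).
        return max(candidates, key=lambda a: self.q(s, a))

    def update(self, s, a, r, s_next, candidates):
        # One-step Q-learning target; gradient step on the consequent weights.
        self.maybe_add_rule(s, a)
        target = r + self.gamma * max(self.q(s_next, c) for c in candidates)
        td_error = target - self.q(s, a)
        phi = self._phi(np.concatenate([s, a]))
        self.weights = list(np.asarray(self.weights) + self.lr * td_error * phi)


# Tiny usage example on a dummy 2-D state, 1-D action problem.
agent = NRBFQSketch()
actions = [np.array([u]) for u in np.linspace(-1.0, 1.0, 11)]
s, s_next = np.zeros(2), np.ones(2) * 0.1
a = agent.best_action(s, actions)
agent.update(s, a, r=1.0, s_next=s_next, candidates=actions)
```

Replacing the gradient step with an EKF over both premise (centers, widths) and consequent parameters, and the candidate grid with a continuous optimizer, recovers the structure described in the abstract.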
Pages: 100-109
Number of pages: 9
Related papers
50 results in total
[21]   Backward Q-learning: The combination of Sarsa algorithm and Q-learning [J].
Wang, Yin-Hao ;
Li, Tzuu-Hseng S. ;
Lin, Chih-Jui .
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) :2184-2193
[22]   Incorporating Expert Knowledge in Q-Learning by means of Fuzzy Rules [J].
Pourhassan, Mojgan ;
Mozayani, Nasser .
2009 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR MEASUREMENT SYSTEMS AND APPLICATIONS, 2009, :219-222
[23]   Dynamic Level of Difficulties Using Q-Learning and Fuzzy Logic [J].
Annisa Damastuti, Fardani ;
Firmansyah, Kenan ;
Miftachul Arif, Yunifa ;
Dutono, Titon ;
Barakbah, Aliridho ;
Hariadi, Mochamad .
IEEE ACCESS, 2024, 12 :137775-137789
[24]   Multiresolution State-Space Discretization Method for Q-Learning with Function Approximation and Policy Iteration [J].
Lampton, Amanda ;
Valasek, John .
2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, :2677-2682
[25]   NAO robot obstacle avoidance based on fuzzy Q-learning [J].
Wen, Shuhuan ;
Hu, Xueheng ;
Li, Zhen ;
Lam, Hak Keung ;
Sun, Fuchun ;
Fang, Bin .
INDUSTRIAL ROBOT-THE INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH AND APPLICATION, 2020, 47 (06) :801-811
[26]   Improving Q-learning by using the agent's action history [J].
Saito M. ;
Sekozawa T. .
IEEJ Transactions on Electronics, Information and Systems, 2016, 136 (08) :1209-1217
[27]   Learning rates for Q-learning [J].
Even-Dar, E ;
Mansour, Y .
JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 5 :1-25
[28]   A Path Planning Algorithm for Space Manipulator Based on Q-Learning [J].
Li, Taiguo ;
Li, Quanhong ;
Li, Wenxi ;
Xia, Jiagao ;
Tang, Wenhua ;
Wang, Weiwen .
PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019, :1566-1571
[29]   Q-learning based on neural network in learning action selection of mobile robot [J].
Qiao, Junfei ;
Hou, Zhanjun ;
Ruan, Xiaogang .
2007 IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND LOGISTICS, VOLS 1-6, 2007, :263-267
[30]   CVaR Q-Learning [J].
Stanko, Silvestr ;
Macek, Karel .
COMPUTATIONAL INTELLIGENCE: 11th International Joint Conference, IJCCI 2019, Vienna, Austria, September 17-19, 2019, Revised Selected Papers, 2021, 922 :333-358