Policy gradient fuzzy reinforcement learning

被引：0

作者：

Wang, XN ^{[1
]}

Xu, X ^{[1
]}

He, HG ^{[1
]}

机构：

[1] Natl Univ Def Technol, Inst Automat, Changsha 410073, Peoples R China

来源：

PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2004年

关键词：

reinforcement learning; fuzzy control; policy gradient; gradient estimate;

D O I：

暂无

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

This paper presents a new approach for tuning conclusions of fuzzy rules based on reinforcement learning. Unlike the most of existing fuzzy reinforcement learning algorithms which are based on value function, while our approach called policy gradient fuzzy reinforcement learning (PGFRL) bases on gradient estimate. In PGFRL, The algorithm GPOMDP is employed to estimate the performance gradient with respect to the parameters of fuzzy rules. In our work we prove the convergence of fuzzy rules' parameters to a local optimum given necessary conditions. The experiment results show the effectiveness of PGFRL.

引用

页码：992 / 995

页数：4