Policy gradient fuzzy reinforcement learning

Cited by: 0
Authors
Wang, XN [1 ]
Xu, X [1 ]
He, HG [1 ]
Affiliation
[1] Natl Univ Def Technol, Inst Automat, Changsha 410073, Peoples R China
Source
PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2004
Keywords
reinforcement learning; fuzzy control; policy gradient; gradient estimate
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a new approach for tuning the conclusions of fuzzy rules based on reinforcement learning. Unlike most existing fuzzy reinforcement learning algorithms, which are based on value functions, our approach, called policy gradient fuzzy reinforcement learning (PGFRL), is based on gradient estimation. In PGFRL, the GPOMDP algorithm is employed to estimate the performance gradient with respect to the parameters of the fuzzy rules. We prove that the parameters of the fuzzy rules converge to a local optimum under the necessary conditions. The experimental results show the effectiveness of PGFRL.
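The abstract gives no implementation details, so the following is only a minimal sketch of how a GPOMDP-style gradient estimate with respect to fuzzy rule conclusions might look. It is not the authors' PGFRL. The triangular rule base, the softmax policy over rule conclusions, the toy environment, and every function and parameter name (`firing_strengths`, `action_probs`, `gpomdp`, `toy_env`, `centers`, `width`) are assumptions made for illustration. Only the trace and running-average recursions follow the standard GPOMDP form.

```python
import math
import random


def firing_strengths(y, centers, width):
    """Normalized triangular membership degrees of observation y (assumed rule base)."""
    raw = [max(0.0, 1.0 - abs(y - c) / width) for c in centers]
    total = sum(raw)
    if total == 0.0:
        # Outside every rule's support: fall back to uniform strengths.
        return [1.0 / len(centers)] * len(centers)
    return [r / total for r in raw]


def action_probs(theta, phi):
    """Softmax over actions of the firing-strength-weighted rule conclusions."""
    n_actions = len(theta[0])
    prefs = [sum(phi[i] * theta[i][a] for i in range(len(phi)))
             for a in range(n_actions)]
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    norm = sum(exps)
    return [e / norm for e in exps]


def gpomdp(theta, env_step, y0, centers, width, beta, steps, rng=None):
    """Estimate the average-reward gradient w.r.t. theta from one trajectory."""
    if rng is None:
        rng = random.Random(0)
    n_rules, n_actions = len(theta), len(theta[0])
    z = [[0.0] * n_actions for _ in range(n_rules)]      # eligibility trace
    delta = [[0.0] * n_actions for _ in range(n_rules)]  # running gradient estimate
    y = y0
    for t in range(steps):
        phi = firing_strengths(y, centers, width)
        probs = action_probs(theta, phi)
        action = rng.choices(range(n_actions), weights=probs)[0]
        y, reward = env_step(y, action, rng)
        for i in range(n_rules):
            for a in range(n_actions):
                # Gradient of log pi(action | y) for conclusion theta[i][a].
                g = phi[i] * ((1.0 if a == action else 0.0) - probs[a])
                z[i][a] = beta * z[i][a] + g
                delta[i][a] += (reward * z[i][a] - delta[i][a]) / (t + 1)
    return delta


def toy_env(y, action, rng):
    """Hypothetical task: action 0 is rewarded when y < 0.5, action 1 otherwise."""
    reward = 1.0 if (action == 0) == (y < 0.5) else 0.0
    return rng.random(), reward


# Two rules centered at 0 and 1, two actions, all conclusions initially zero.
theta0 = [[0.0, 0.0], [0.0, 0.0]]
estimate = gpomdp(theta0, toy_env, y0=0.5, centers=[0.0, 1.0], width=1.0,
                  beta=0.5, steps=20000, rng=random.Random(0))
```

In this toy setting the estimate should favor action 0 in the rule centered at 0 and action 1 in the rule centered at 1, so a small step along `estimate` would move the conclusions in the rewarded direction. The discount factor `beta` trades bias against variance in the estimate; the value 0.5 here is arbitrary.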
Pages: 992 - 995
Page count: 4
Related papers
50 in total
  • [1] A policy gradient reinforcement learning algorithm with fuzzy function approximation
    Gu, DB
    Yang, EF
    IEEE ROBIO 2004: Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2004, : 936 - 940
  • [2] Reinforcement Learning to Rank with Pairwise Policy Gradient
    Xu, Jun
    Wei, Zeng
    Xia, Long
    Lan, Yanyan
    Yin, Dawei
    Cheng, Xueqi
    Wen, Ji-Rong
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 509 - 518
  • [3] On the use of the policy gradient and Hessian in inverse reinforcement learning
    Metelli, Alberto Maria
    Pirotta, Matteo
    Restelli, Marcello
    INTELLIGENZA ARTIFICIALE, 2020, 14 (01) : 117 - 150
  • [4] Batch Reinforcement Learning With a Nonparametric Off-Policy Policy Gradient
    Tosatto, Samuele
    Carvalho, Joao
    Peters, Jan
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (10) : 5996 - 6010
  • [5] A Residual Gradient Fuzzy Reinforcement Learning Algorithm for Differential Games
    Awheda, Mostafa D.
    Schwartz, Howard M.
    INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2017, 19 (04) : 1058 - 1076
  • [6] A Residual Gradient Fuzzy Reinforcement Learning Algorithm for Differential Games
    Mostafa D. Awheda
    Howard M. Schwartz
    International Journal of Fuzzy Systems, 2017, 19 : 1058 - 1076
  • [7] Hessian matrix distribution for Bayesian policy gradient reinforcement learning
    Ngo Anh Vien
    Yu, Hwanjo
    Chung, TaeChoong
    INFORMATION SCIENCES, 2011, 181 (09) : 1671 - 1685
  • [8] Adaptive Natural Policy Gradient in Reinforcement Learning
    Li, Dazi
    Qiao, Zengyuan
    Song, Tianheng
    Jin, Qibing
    PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS), 2018, : 605 - 610
  • [9] Traffic Light Control with Policy Gradient-Based Reinforcement Learning
    Tas, Mehmet Bilge Han
    Ozkan, Kemal
    Saricicek, Inci
    Yazici, Ahmet
    32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024, 2024,
  • [10] Survey of Deep Reinforcement Learning Based on Value Function and Policy Gradient
    Liu J.-W.
    Gao F.
    Luo X.-L.
    Jisuanji Xuebao/Chinese Journal of Computers, 2019, 42 (06) : 1406 - 1438