Policy gradient fuzzy reinforcement learning

Cited by: 0
Authors
Wang, XN [1]
Xu, X [1]
He, HG [1]
Affiliations
[1] Natl Univ Def Technol, Inst Automat, Changsha 410073, Peoples R China
Source
PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2004
Keywords
reinforcement learning; fuzzy control; policy gradient; gradient estimate;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a new approach to tuning the conclusions of fuzzy rules based on reinforcement learning. Unlike most existing fuzzy reinforcement learning algorithms, which are based on value functions, our approach, called policy gradient fuzzy reinforcement learning (PGFRL), is based on gradient estimation. In PGFRL, the GPOMDP algorithm is employed to estimate the performance gradient with respect to the parameters of the fuzzy rules. We prove that, given the necessary conditions, the parameters of the fuzzy rules converge to a local optimum. The experimental results show the effectiveness of PGFRL.
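The gradient estimate mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give the policy form, so the sketch assumes a Gaussian policy whose mean is the output of a zero-order Takagi-Sugeno fuzzy system with rule conclusions `theta`. The names `memberships`, `gpomdp_gradient`, and `toy_env`, and all parameter values, are hypothetical. Only the trace and averaging updates follow the standard GPOMDP recursion.

```python
import math
import random


def memberships(x, centers, width):
    """Normalized Gaussian membership degrees of x, one per fuzzy rule (assumed form)."""
    raw = [math.exp(-((x - c) / width) ** 2) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


def toy_env(x, u, rng):
    """Hypothetical task: reward is highest when the action u equals 2*x."""
    reward = -(u - 2.0 * x) ** 2
    return rng.random(), reward  # next state drawn uniformly from [0, 1)


def gpomdp_gradient(theta, step_env, centers, width=0.25, sigma=0.3,
                    beta=0.9, steps=5000, seed=0):
    """Estimate the average-reward gradient w.r.t. the rule conclusions theta."""
    rng = random.Random(seed)
    z = [0.0] * len(theta)      # eligibility trace
    delta = [0.0] * len(theta)  # running gradient estimate
    x = rng.random()
    for t in range(steps):
        phi = memberships(x, centers, width)
        mean = sum(p * th for p, th in zip(phi, theta))
        u = rng.gauss(mean, sigma)  # sample an action from the assumed Gaussian policy
        # Gradient of log N(u; mean, sigma^2) with respect to each conclusion.
        score = [(u - mean) / sigma ** 2 * p for p in phi]
        x, reward = step_env(x, u, rng)
        # GPOMDP: discount the trace, add the score, then average reward * trace.
        z = [beta * zi + si for zi, si in zip(z, score)]
        delta = [d + (reward * zi - d) / (t + 1) for d, zi in zip(delta, z)]
    return delta


grad = gpomdp_gradient([0.0, 0.0, 0.0], toy_env, centers=[0.0, 0.5, 1.0])
```

In this toy setting the conclusions would then be adjusted by a small step along `grad`. The discount `beta` trades bias against variance in the estimate, so larger values generally need more steps.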
Pages: 992-995
Page count: 4
Related papers
50 items in total
[31]   REINFORCEMENT LEARNING OF SPEECH RECOGNITION SYSTEM BASED ON POLICY GRADIENT AND HYPOTHESIS SELECTION [J].
Kato, Taku ;
Shinozaki, Takahiro .
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, :5759-5763
[32]   Active structural control framework using policy-gradient reinforcement learning [J].
Eshkevari, Soheila Sadeghi ;
Eshkevari, Soheil Sadeghi ;
Sen, Debarshi ;
Pakzad, Shamim N. .
ENGINEERING STRUCTURES, 2022, 274
[33]   Model-free Reinforcement Learning of Semantic Communication by Stochastic Policy Gradient [J].
Beck, Edgar ;
Bockelmann, Carsten ;
Dekorsy, Armin .
2024 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING FOR COMMUNICATION AND NETWORKING, ICMLCN 2024, 2024, :367-373
[34]   Control Randomisation Approach for Policy Gradient and Application to Reinforcement Learning in Optimal Switching [J].
Denkert, Robert ;
Pham, Huyen ;
Warin, Xavier .
APPLIED MATHEMATICS AND OPTIMIZATION, 2025, 91 (01)
[35]   Continuous Parameter Control in Genetic Algorithms using Policy Gradient Reinforcement Learning [J].
de Miguel Gomez, Alejandro ;
Toosi, Farshad Ghassemi .
PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON COMPUTATIONAL INTELLIGENCE (IJCCI), 2021, :115-122
[36]   QSOD: Hybrid Policy Gradient for Deep Multi-agent Reinforcement Learning [J].
Rehman, Hafiz Muhammad Raza Ur ;
On, Byung-Won ;
Ningombam, Devarani Devi ;
Yi, Sungwon ;
Choi, Gyu Sang .
IEEE ACCESS, 2021, 9 :129728-129741
[37]   Reinforcement Learning for Mobile Robot Obstacle Avoidance with Deep Deterministic Policy Gradient [J].
Chen, Miao ;
Li, Wenna ;
Fei, Shihan ;
Wei, Yufei ;
Tu, Mingyang ;
Li, Jiangbo .
INTELLIGENT ROBOTICS AND APPLICATIONS (ICIRA 2022), PT III, 2022, 13457 :197-204
[38]   Practical Critic Gradient based Actor Critic for On-Policy Reinforcement Learning [J].
Gurumurthy, Swaminathan ;
Manchester, Zachary ;
Kolter, J. Zico .
LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
[39]   Gradient Monitored Reinforcement Learning [J].
Abdul Hameed, Mohammed Sharafath ;
Chadha, Gavneet Singh ;
Schwung, Andreas ;
Ding, Steven X. .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (08) :4106-4119
[40]   Learning Heuristics for the TSP by Policy Gradient [J].
Deudon, Michel ;
Cournut, Pierre ;
Lacoste, Alexandre ;
Adulyasak, Yossiri ;
Rousseau, Louis-Martin .
INTEGRATION OF CONSTRAINT PROGRAMMING, ARTIFICIAL INTELLIGENCE, AND OPERATIONS RESEARCH, CPAIOR 2018, 2018, 10848 :170-181