VALUE FUNCTION ESTIMATION BASED ON AN ERROR GAUSSIAN MIXTURE MODEL

Cited by: 0
Authors
Cui, Delong [1 ]
Peng, Zhiping [1 ]
Li, Qirui [1 ]
He, Jieguang [1 ]
Li, Kaibin [1 ]
Hung, Shangchao [2 ,3 ]
Affiliations
[1] Guangdong Univ Petrochem Technol, Coll Comp & Elect Informat, Maoming 525000, Guangdong, Peoples R China
[2] Fuzhou Univ, Fuzhou Polytech, Fuzhou 350108, Fujian, Peoples R China
[3] Intelligent Technol Res Ctr, Fuzhou 350108, Fujian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Value function estimation; error Gaussian mixture model; Gaussian process regression; reinforcement learning;
DOI
Not available
CLC number
O29 [Applied Mathematics];
Subject classification code
070104;
Abstract
In reinforcement learning, balancing exploration and exploitation in an agent's action selection has always been a key problem: the agent should not only exploit the action with the highest estimated value, but also explore potentially optimal actions. Inspired by this trade-off, a novel value function estimation algorithm based on an error Gaussian mixture model (EGMM) is proposed in this paper. First, appropriate variables are chosen from the error data, and the number of Gaussian components is determined by optimizing the Bayesian information criterion of the EGMM. Then, the EGMM is fitted to the error data to obtain the conditional error mean, which is used to compensate the model output and thus yields more accurate estimates. We test the performance of the designed algorithm on a virtual experimental platform in a cloud computing environment. Experiments demonstrate that the proposed algorithm eliminates the influence of non-Gaussian noise on model prediction performance.
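The BIC-driven choice of the number of Gaussian components described in the abstract can be sketched as follows. This is a minimal, self-contained 1-D illustration under stated assumptions, not the paper's implementation: the paper's EGMM fits the joint input-error density and compensates with a conditional error mean, whereas this sketch simplifies that step to the unconditional mixture mean of the fitted error model; all names (`gmm_em`, `bic`, `compensated`) are illustrative.

```python
import math
import random

def gmm_em(data, k, iters=100, seed=0):
    """Fit a 1-D Gaussian mixture with k components by EM.
    Returns (weights, means, variances, final log-likelihood)."""
    rng = random.Random(seed)
    n = len(data)
    mu = rng.sample(data, k)                      # initialize means on data points
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    sig2 = [overall_var] * k                      # start with broad components
    w = [1.0 / k] * k
    ll = float("-inf")
    for _ in range(iters):
        # E-step: responsibilities and log-likelihood
        resp, ll = [], 0.0
        for x in data:
            dens = [w[j] / math.sqrt(2 * math.pi * sig2[j])
                    * math.exp(-(x - mu[j]) ** 2 / (2 * sig2[j])) for j in range(k)]
            s = sum(dens)
            ll += math.log(s)
            resp.append([d / s for d in dens])
        # M-step: re-estimate weights, means, variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:                         # leave an empty component untouched
                continue
            w[j] = nj / n
            mu[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            sig2[j] = max(sum(r[j] * (x - mu[j]) ** 2
                              for r, x in zip(resp, data)) / nj, 1e-6)
    return w, mu, sig2, ll

def bic(ll, k, n):
    p = 3 * k - 1                                 # free parameters of a 1-D k-component GMM
    return -2.0 * ll + p * math.log(n)

# Synthetic non-Gaussian (bimodal) error data standing in for the model's residuals.
data_rng = random.Random(1)
errors = ([data_rng.gauss(-2.0, 0.5) for _ in range(150)]
          + [data_rng.gauss(3.0, 0.8) for _ in range(150)])

# Choose the component count that minimizes BIC, as the abstract describes.
best_k = min(range(1, 5), key=lambda k: bic(gmm_em(errors, k)[3], k, len(errors)))
w, mu, sig2, _ = gmm_em(errors, best_k)

# Compensation term: the mixture mean of the fitted error model is
# subtracted from the raw model output.
mean_error = sum(wj * mj for wj, mj in zip(w, mu))

def compensated(raw_output):
    return raw_output - mean_error
```

A single Gaussian would average the two error modes into one wide component; BIC rewards the extra components only while they improve the fit enough to offset the parameter penalty, which is the criterion the EGMM uses to avoid over-fitting the error density.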
Pages: 1687-1702
Number of pages: 16
Cited references
31 in total
[11]  
Huang Bing-qiang, 2007, Computer Engineering, V33, P18
[12]  
Lee D, 2013, IEEE SYMP ADAPT DYNA, P93, DOI 10.1109/ADPRL.2013.6614994
[13]  
Lin Z., 2018, P 27 INT JOINT C ART, P2433
[14]  
Liu Quan, 2018, Chinese Journal of Computers, V41, P1
[15]   Deep Reinforcement Learning for Offloading and Resource Allocation in Vehicle Edge Computing and Networks [J].
Liu, Yi ;
Yu, Huimin ;
Xie, Shengli ;
Zhang, Yan .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2019, 68 (11) :11158-11168
[16]   Stochastic Double Deep Q-Network [J].
Lv, Pingli ;
Wang, Xuesong ;
Cheng, Yuhu ;
Duan, Ziming .
IEEE ACCESS, 2019, 7 :79446-79454
[17]  
Min Hua-qing, 2011, Control Theory & Applications, V28, P256
[18]  
Min HQ, 2009, WORLD SUMMIT ON GENETIC AND EVOLUTIONARY COMPUTATION (GEC 09), P421
[19]   Human-level control through deep reinforcement learning [J].
Mnih, Volodymyr ;
Kavukcuoglu, Koray ;
Silver, David ;
Rusu, Andrei A. ;
Veness, Joel ;
Bellemare, Marc G. ;
Graves, Alex ;
Riedmiller, Martin ;
Fidjeland, Andreas K. ;
Ostrovski, Georg ;
Petersen, Stig ;
Beattie, Charles ;
Sadik, Amir ;
Antonoglou, Ioannis ;
King, Helen ;
Kumaran, Dharshan ;
Wierstra, Daan ;
Legg, Shane ;
Hassabis, Demis .
NATURE, 2015, 518 (7540) :529-533
[20]  
Mohajeri NN, 2013, IRAN CONF ELECTR ENG