Reinforcement learning algorithms with function approximation: Recent advances and applications

Cited by: 135

Authors:

Xu, Xin [1]

Zuo, Lei [1]

Huang, Zhenhua [1]

Affiliations:

[1] Natl Univ Def Technol, Coll Mechatron & Automat, Changsha 410073, Hunan, Peoples R China

Source:

INFORMATION SCIENCES | 2014 / Vol. 261

Funding:

National Natural Science Foundation of China;

Keywords:

Reinforcement learning; Function approximation; Approximate dynamic programming; Learning control; Generalization; ZERO-SUM GAMES; COGNITIVE RADIO; GAUSSIAN-PROCESSES; STATE ABSTRACTION; POLICY ITERATION; GRAPH KERNELS; POWER; CONVERGENCE; TD(LAMBDA); FRAMEWORK;

DOI:

10.1016/j.ins.2013.08.037

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In recent years, research on reinforcement learning (RL) has focused on function approximation for learning prediction and learning control in Markov decision processes (MDPs). Function approximation techniques are essential in RL for dealing with MDPs that have large or continuous state and action spaces. This paper gives a comprehensive survey of recent developments in RL algorithms with function approximation. From a theoretical point of view, the convergence and feature representation of RL algorithms are analyzed. From an empirical point of view, the performance of different RL algorithms is evaluated and compared on several benchmark learning prediction and learning control tasks. Applications of RL with function approximation are also discussed. Finally, directions for future work on RL with function approximation are suggested. (C) 2013 Elsevier Inc. All rights reserved.


Pages: 1 / 31

Page count: 31

Cited references: 155 in total
