Towards Enabling Deep Learning Techniques for Adaptive Dynamic Programming

Cited by: 0

Authors:

Ni, Zhen [1]

Malla, Naresh [1]

Zhong, Xiangnan [2]

Affiliations:

[1] South Dakota State Univ, Elect Engn & Comp Sci Dept, Brookings, SD 57007 USA

[2] Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA

Source:

2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2017

Keywords:

Deep Learning; deep reinforcement learning (DRL); adaptive dynamic programming (ADP); experience replay; computational intelligence; Markov decision process; TIME NONLINEAR-SYSTEMS; EXPERIENCE REPLAY; NEURAL-NETWORK; GAME;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep learning and deep reinforcement learning have demonstrated human-level control and revealed their unique and powerful potential in the highly complex game of Go. AlphaGo, developed by Google DeepMind, beat the top Go player early this year. The scientific and technological advances behind AlphaGo's success have attracted researchers from multiple areas, including machine learning, artificial intelligence, and computational intelligence. Adaptive dynamic programming (ADP) methods share the same fundamental principle as reinforcement learning and show strong performance on continuous-time and continuous-state systems. Deep learning techniques can also be integrated into ADP designs. In this paper, we discuss the key techniques and components of deep reinforcement learning and then present successful applications to computer games and maze navigation. Future opportunities for deep-learning-enabled ADP are discussed at the end.

Citation

Pages: 2828-2835

Page count: 8

References: 64 in total
