Feature Extraction in Q-Learning using Neural Networks

Cited by: 0
Authors
Zhu, Henghui [1]
Paschalidis, Ioannis Ch. [2,3,4]
Hasselmo, Michael E. [5]
Affiliations
[1] Boston Univ, Ctr Informat & Syst Engn, Boston, MA 02215 USA
[2] Boston Univ, Dept Elect & Comp Engn, 8 St Marys St, Boston, MA 02215 USA
[3] Boston Univ, Div Syst Engn, 8 St Marys St, Boston, MA 02215 USA
[4] Boston Univ, Dept Biomed Engn, 8 St Marys St, Boston, MA 02215 USA
[5] Boston Univ, Ctr Syst Neurosci, Boston, MA 02215 USA
Source
2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC) | 2017
Keywords
Q-learning; reinforcement learning; Markov decision processes; neural networks;
DOI
Not available
CLC number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Integrating deep neural networks with reinforcement learning has exhibited excellent performance in the literature, highlighting the ability of neural networks to extract features. This paper begins with a simple Markov decision process inspired by a cognitive task. We show that both Q-learning and approximate Q-learning with a linear function approximator fail in this task. In contrast, Q-learning combined with a neural-network-based function approximator can learn the optimal policy. Motivated by this finding, we outline procedures that use a neural network to extract appropriate features, which can then be used in a Q-learning framework with a linear function approximator, obtaining performance similar to that of Q-learning with neural networks. Our work suggests that neural networks can serve as feature extractors in the context of Q-learning.
Pages: 6
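The two-stage procedure described in the abstract (first train a neural-network Q-function with Q-learning, then freeze its hidden layer and reuse it as a feature map for linear Q-learning) can be sketched as follows. This is a minimal illustration, not the paper's method: the 5-state chain MDP, the network width, and all hyperparameters are arbitrary assumptions, and the paper's cognitive-task MDP is not reproduced here.

```python
import numpy as np

# Hypothetical toy MDP (NOT the paper's cognitive task): a 5-state chain
# where action 1 moves right, action 0 moves left, and reaching the
# rightmost state yields reward 1 and ends the episode.
N_STATES, N_ACTIONS = 5, 2

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    done = (s2 == N_STATES - 1)
    return s2, (1.0 if done else 0.0), done

def one_hot(s):
    x = np.zeros(N_STATES)
    x[s] = 1.0
    return x

rng = np.random.default_rng(0)
H = 8  # hidden width (arbitrary choice)
W1 = rng.normal(0, 0.5, (H, N_STATES)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (N_ACTIONS, H)); b2 = np.zeros(N_ACTIONS)

def forward(s):
    h = np.maximum(0.0, W1 @ one_hot(s) + b1)  # ReLU hidden layer
    return h, W2 @ h + b2                      # (features, Q-values)

# Stage 1: Q-learning with a neural-network function approximator,
# trained by semi-gradient SGD on the squared TD error.
gamma, alpha, eps = 0.9, 0.05, 0.2
for _ in range(2000):
    s = 0
    for _ in range(20):
        h, q = forward(s)
        a = rng.integers(N_ACTIONS) if rng.random() < eps else int(np.argmax(q))
        s2, r, done = step(s, a)
        target = r if done else r + gamma * np.max(forward(s2)[1])
        td = target - q[a]
        grad_h = td * W2[a] * (h > 0)          # backprop through the ReLU
        W2[a] += alpha * td * h
        b2[a] += alpha * td
        W1 += alpha * np.outer(grad_h, one_hot(s))
        b1 += alpha * grad_h
        if done:
            break
        s = s2

# Stage 2: freeze the trained hidden layer as a feature extractor phi(s),
# then run Q-learning with a *linear* approximator on those features.
def phi(s):
    return np.maximum(0.0, W1 @ one_hot(s) + b1)

theta = np.zeros((N_ACTIONS, H))
for _ in range(2000):
    s = 0
    for _ in range(20):
        f = phi(s)
        q = theta @ f
        a = rng.integers(N_ACTIONS) if rng.random() < eps else int(np.argmax(q))
        s2, r, done = step(s, a)
        target = r if done else r + gamma * np.max(theta @ phi(s2))
        theta[a] += alpha * (target - q[a]) * f
        if done:
            break
        s = s2
```

The key design point is that stage 2 never updates `W1` or `b1`: the hidden layer serves purely as a fixed feature map, so the stage-2 learner is an ordinary linear Q-learning agent over `phi(s)`.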