The Pong game implementation with the FRIQ-learning reinforcement learning algorithm

Cited by: 0

Authors:

Tompa, Tamas [1]

Vincze, David [1]

Kovacs, Szilveszter [1]

Affiliations:

[1] Univ Miskolc, Dept Informat Technol, Miskolc, Hungary

Source:

2015 16TH INTERNATIONAL CARPATHIAN CONTROL CONFERENCE (ICCC) | 2015

Keywords:

fuzzy rule-interpolation; reinforcement learning; Q-learning; FRIQ-learning; Pong

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper introduces a way to control the Pong game automatically using FRIQ-learning (Fuzzy Rule Interpolation-based Q-learning). FRIQ-learning is a candidate solution for problems with a small state space. The system starts with an empty knowledge base and constructs the final rule-base during the simulation, guided by the reward for solving the task. In this way the method finds the required rules from the feedback provided by the environment. To solve the problem correctly, the reward function must be defined carefully for the task at hand (here, handling the paddle in Pong). After determining the required specifications (e.g. the actions and their effects), we used the FRIQ-learning framework to build a simulation application. Because FRIQ-learning gathers the required knowledge automatically in the form of a fuzzy rule-base, it can be applied to systems whose exact operation is unknown. Our main goal is to show that FRIQ-learning is suitable for this problem by automatically constructing a sparse rule-base for Pong.
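As a rough illustration of the reward-driven learning loop the abstract describes, the sketch below uses plain tabular Q-learning on a toy paddle-tracking task. It is not FRIQ-learning: there is no fuzzy rule interpolation, and the dictionary that starts empty and fills in as state-action pairs are tried only loosely mirrors the idea of starting from an empty knowledge base. The action set, the `step` dynamics, the `reward` function, and all hyperparameters are assumptions made for this example, not taken from the paper.

```python
import random

# Hypothetical action set: move the paddle by -1, stay, or move by +1.
ACTIONS = (-1, 0, 1)


def reward(offset):
    """Assumed reward: +1 when the paddle is aligned with the ball, else -1."""
    return 1.0 if offset == 0 else -1.0


def step(offset, action, limit=3):
    """Toy dynamics: offset = ball_y - paddle_y, clamped to [-limit, limit].

    Moving the paddle by `action` reduces the offset by the same amount.
    The ball itself does not move in this simplified setting.
    """
    return max(-limit, min(limit, offset - action))


def train(episodes=500, steps=20, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy."""
    rng = random.Random(seed)
    q = {}  # starts empty; an entry is written only when a pair is tried
    for _ in range(episodes):
        offset = rng.randint(-3, 3)
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = greedy_action(q, offset)  # exploit
            nxt = step(offset, action)
            best_next = max(q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((offset, action), 0.0)
            # Standard Q-learning update toward reward + discounted best value.
            q[(offset, action)] = old + alpha * (reward(nxt) + gamma * best_next - old)
            offset = nxt
    return q


def greedy_action(q, offset):
    """Action with the highest learned value; unseen pairs default to 0."""
    return max(ACTIONS, key=lambda a: q.get((offset, a), 0.0))
```

Calling `q = train()` and then `greedy_action(q, 2)` queries the learned policy for a ball two cells away from the paddle. The reward here depends only on alignment, which reflects the abstract's point that the reward function has to be chosen with the specific task in mind.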


Pages: 542-547

Page count: 6

