Imitation Reinforcement Learning-Based Remote Rotary Inverted Pendulum Control in OpenFlow Network

Cited by: 20
Authors
Kim, Ju-Bong [1 ]
Lim, Hyun-Kyo [2 ]
Kim, Chan-Myung [3 ]
Kim, Min-Suk [4 ]
Hong, Yong-Geun [4 ]
Han, Youn-Hee [1 ]
Affiliations
[1] Korea Univ Technol & Educ, Dept Comp Sci Engn, Cheonan 31253, South Korea
[2] Korea Univ Technol & Educ, Dept Interdisciplinary Program Creat Engn, Cheonan 31253, South Korea
[3] Adv Technol Res Inst, Cheonan 31253, South Korea
[4] Elect & Telecommun Res Inst, Dept Knowledge Converged Super Brain Convergence, Daejeon 34129, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Reinforcement learning; remote control; control engineering; OpenFlow; CPS;
DOI
10.1109/ACCESS.2019.2905621
Chinese Library Classification (CLC) number
TP [automation technology, computer technology];
Discipline classification code
0812;
Abstract
The rotary inverted pendulum is an unstable and highly nonlinear device that has been widely used as a benchmark application in the field of nonlinear control engineering. In this paper, we use a rotary inverted pendulum as a deep reinforcement learning environment. The experimental system is composed of a cyber environment and a physical environment connected through an OpenFlow network, and the MQTT protocol is used over the Ethernet connection to link the two environments. The reinforcement learning agent is trained to control the real device located remotely from the controller, and a classical PID controller is also utilized to implement imitation reinforcement learning and to facilitate the learning process. With our CPS-based experimental system, we verify that a deep reinforcement learning agent can successfully control the real device located remotely from the agent and that our imitation learning strategy effectively reduces the learning time.
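The abstract describes the imitation idea only at a high level. Below is a minimal, self-contained Python sketch of one way such a scheme can work: a classical PID controller acts as a teacher, its (state, action) pairs are logged, and a small policy is fit to them by behavioral cloning as a warm start before reinforcement learning against the remote device. The toy pendulum dynamics, PID gains, action discretization, and the MQTT topic names mentioned in the comments are illustrative assumptions, not taken from the paper.

```python
import numpy as np

class PIDTeacher:
    """Classical PID controller used only to generate demonstration actions."""
    def __init__(self, kp=8.0, ki=0.1, kd=1.5, dt=0.02):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_err = 0.0, 0.0

    def act(self, angle_error):
        self.integral += angle_error * self.dt
        derivative = (angle_error - self.prev_err) / self.dt
        self.prev_err = angle_error
        u = self.kp * angle_error + self.ki * self.integral + self.kd * derivative
        # Discretize the continuous command into a small action set {left, stay, right}.
        return 0 if u < -0.5 else (2 if u > 0.5 else 1)

def pendulum_step(state, action, dt=0.02):
    """Toy pendulum dynamics used here as a stand-in for the real remote device."""
    theta, theta_dot = state
    torque = (action - 1) * 3.0                     # map {0, 1, 2} -> {-3, 0, +3}
    theta_ddot = 9.81 * np.sin(theta) - 0.5 * theta_dot + torque
    theta_dot += theta_ddot * dt
    return np.array([theta + theta_dot * dt, theta_dot])

def collect_demonstrations(episodes=50, horizon=200):
    """Roll out the PID teacher and record (state, action) pairs."""
    data = []
    for _ in range(episodes):
        teacher = PIDTeacher()
        state = np.array([np.random.uniform(-0.2, 0.2), 0.0])
        for _ in range(horizon):
            action = teacher.act(-state[0])         # drive the pendulum angle to zero
            data.append((state.copy(), action))
            state = pendulum_step(state, action)
    return data

def behavioral_cloning(data, lr=0.1, epochs=200):
    """Fit a linear softmax policy to the demonstrations (imitation warm start)."""
    X = np.array([s for s, _ in data])
    y = np.array([a for _, a in data])
    W = np.zeros((X.shape[1], 3))
    for _ in range(epochs):
        logits = X @ W
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        probs[np.arange(len(y)), y] -= 1.0          # softmax cross-entropy gradient
        W -= lr * (X.T @ probs) / len(y)
    return W

if __name__ == "__main__":
    demos = collect_demonstrations()
    W = behavioral_cloning(demos)
    # In the paper's setting, the pretrained policy would then be refined with deep RL
    # against the real pendulum over the OpenFlow network, with actions published and
    # states received via MQTT (topic names such as "pendulum/action" are assumptions).
    X = np.array([s for s, _ in demos])
    y = np.array([a for _, a in demos])
    acc = (np.argmax(X @ W, axis=1) == y).mean()
    print(f"imitation accuracy on PID demonstrations: {acc:.2f}")
```

The paper combines the PID teacher with deep reinforcement learning rather than plain supervised pretraining; the sketch only illustrates the warm-start principle behind using a classical controller to shorten learning time.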
Pages: 36682-36690
Page count: 9