Interactive Reinforcement Learning Strategy

Cited by: 1

Authors:

Shi, Zhenjie [1]

Ma, Wenming [1]

Yin, Shuai [1]

Zhang, Hailiang [1]

Zhao, Xiaofan [1]

Affiliations:

[1] Yantai Univ, Sch Comp & Control Engn, Yantai, Peoples R China

Source:

2021 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, INTERNET OF PEOPLE, AND SMART CITY INNOVATIONS (SMARTWORLD/SCALCOM/UIC/ATC/IOP/SCI 2021) | 2021

Keywords:

Reinforcement learning; interactive learning; path planning; Q-learning

DOI:

10.1109/SWC50871.2021.00075

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The advent of AlphaGo set off a new wave of interest in reinforcement learning, which has become one of the most popular directions in artificial intelligence. In essence, it integrates and extends various machine learning methods: an agent learns by repeated trial and error while accumulating reward. Q-learning is the most commonly used reinforcement learning method, but it suffers from scarce information early in training, long learning time, low learning efficiency, and repeated trial and error. As a result, Q-learning cannot be applied directly to real environments. To address this, the paper proposes an interactive learning method that combines voice commands with Q-learning. In the early stage of learning, the agent uses part of its interaction with human voice commands to locate a larger target range, and then narrows the search range step by step. This guides the agent to learn quickly and reduces the blindness of its exploration. Simulation experiments show that, compared with the standard Q-learning algorithm, the proposed algorithm converges faster, shortens learning time, and reduces the number of collisions, enabling the agent to quickly find a better collision-free path.
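
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no algorithmic details, so the grid world, the reward values, and the helper names `region_radius`, `hinted_actions`, `q_update`, and `train` are all assumptions made for illustration. Voice commands are modeled here as a hinted target region around the goal that shrinks over episodes; while the agent is outside that region, its exploratory moves are restricted to actions that bring it closer. The value update itself is the standard tabular Q-learning rule.

```python
import random

# Actions as (row, col) offsets: up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def dist(a, b):
    """Manhattan distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def region_radius(episode, start_radius=4, shrink_every=20):
    """Assumed schedule: the hinted target region narrows as learning proceeds."""
    return max(0, start_radius - episode // shrink_every)


def hinted_actions(state, goal, radius):
    """Outside the hinted region, keep only actions that move toward it."""
    if dist(state, goal) <= radius:
        return list(range(len(ACTIONS)))
    here = dist(state, goal)
    return [i for i, (dr, dc) in enumerate(ACTIONS)
            if dist((state[0] + dr, state[1] + dc), goal) < here]


def q_update(old, reward, best_next, alpha, gamma):
    """Standard Q-learning update for one transition."""
    return old + alpha * (reward + gamma * best_next - old)


def train(size=6, goal=(5, 5), obstacles=((2, 2), (3, 4)), episodes=300,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {}  # (state, action index) -> value; missing entries count as 0.0
    for episode in range(episodes):
        state = (0, 0)
        radius = region_radius(episode)
        for _ in range(200):  # step cap per episode
            if rng.random() < epsilon:
                # Exploration is biased by the (assumed) voice hint.
                action = rng.choice(hinted_actions(state, goal, radius))
            else:
                action = max(range(4), key=lambda i: q.get((state, i), 0.0))
            nxt = (state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1])
            in_bounds = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if not in_bounds or nxt in obstacles:
                nxt, reward = state, -1.0   # collision: stay in place, penalty
            elif nxt == goal:
                reward = 10.0
            else:
                reward = -0.1               # small cost per step
            best_next = 0.0 if nxt == goal else max(
                q.get((nxt, i), 0.0) for i in range(4))
            old = q.get((state, action), 0.0)
            q[(state, action)] = q_update(old, reward, best_next, alpha, gamma)
            state = nxt
            if state == goal:
                break
    return q
```

Calling `train()` returns the learned Q-table as a dictionary keyed by `(state, action index)`. Biasing exploration rather than the reward is a design choice of this sketch: it leaves the reward signal unchanged, so the hint only affects which actions are tried early on.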

Pages: 507-512

Page count: 6
