Interactive Reinforcement Learning Strategy

Cited by: 1
Authors
Shi, Zhenjie [1]
Ma, Wenming [1]
Yin, Shuai [1]
Zhang, Hailiang [1]
Zhao, Xiaofan [1]
Affiliations
[1] Yantai Univ, Sch Comp & Control Engn, Yantai, Peoples R China
Source
2021 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, INTERNET OF PEOPLE, AND SMART CITY INNOVATIONS (SMARTWORLD/SCALCOM/UIC/ATC/IOP/SCI 2021) | 2021
Keywords
Reinforcement learning; interactive learning; path planning; Q-learning;
DOI
10.1109/SWC50871.2021.00075
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The birth of AlphaGo set off a new wave of reinforcement learning research, and reinforcement learning has become one of the most popular directions in artificial intelligence. In essence, it continually integrates and upgrades various machine learning methods, with agents learning by repeated trial and error to obtain cumulative rewards. Q-learning is the most commonly used reinforcement learning method, but it suffers from several problems: little information in the early stage, long learning time, low learning efficiency, and repeated trial and error. As a result, Q-learning cannot be applied directly to real environments. To address this, the authors discuss an interactive learning method that combines voice commands with Q-learning. In the early stage of learning, the method uses partial voice interaction between the agent and a human to identify a larger target range, and then narrows the search range step by step. This guides the agent to learn quickly and reduces the blindness of learning. Simulation experiments show that, compared with the standard Q-learning algorithm, the proposed algorithm improves convergence speed, shortens learning time, and reduces the number of collisions, enabling the agent to quickly find a better collision-free path.
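The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea: standard tabular Q-learning whose exploration step can be biased by a coarse hint. The names `q_update`, `choose_action`, and `hint`, the grid representation, and the use of Manhattan distance are all assumptions made for illustration; the `hint` cell merely stands in for a decoded voice command and is not the authors' design.

```python
import random

# Grid moves as (row, col) deltas: right, left, down, up.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one standard Q-learning update to a dict-based Q-table."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(len(ACTIONS)))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def choose_action(q, state, epsilon, rng, hint=None):
    """Epsilon-greedy selection with optional guided exploration.

    `hint` is a hypothetical coarse target cell standing in for a decoded
    voice command (an assumption, not the paper's design).
    """
    if rng.random() < epsilon:
        if hint is None:
            return rng.randrange(len(ACTIONS))  # blind exploration

        # Guided exploration: take the move that ends closest to the hint.
        def dist(a):
            r, c = state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1]
            return abs(r - hint[0]) + abs(c - hint[1])

        return min(range(len(ACTIONS)), key=dist)
    # Exploitation: pick the action with the highest current Q-value.
    return max(range(len(ACTIONS)), key=lambda a: q.get((state, a), 0.0))
```

In a training loop under these assumptions, one would pass a hint during early episodes and set `hint=None` later, so that exploration starts guided and then falls back to ordinary epsilon-greedy Q-learning.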
Pages: 507 - 512
Page count: 6
Related papers
50 in total
  • [41] Interactive Reinforcement Learning for Feature Selection With Decision Tree in the Loop
    Fan, Wei
    Liu, Kunpeng
    Liu, Hao
    Ge, Yong
    Xiong, Hui
    Fu, Yanjie
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (02) : 1624 - 1636
  • [42] Reinforcement Learning for Interactive QoS-Aware Services Composition
    Alizadeh, Pegah
    Osmani, Aomar
    Khanouche, Mohamed Essaid
    Chibani, Abdelghani
    Amirat, Yacine
    IEEE SYSTEMS JOURNAL, 2021, 15 (01) : 1098 - 1108
  • [43] Effect of Interaction Design on the Human Experience with Interactive Reinforcement Learning
    Krening, Samantha
    Feigh, Karen M.
    PROCEEDINGS OF THE 2019 ACM DESIGNING INTERACTIVE SYSTEMS CONFERENCE (DIS 2019), 2019 : 1089 - 1100
  • [44] Knowledge-guided Deep Reinforcement Learning for Interactive Recommendation
    Chen, Xiaocong
    Huang, Chaoran
    Yao, Lina
    Wang, Xianzhi
    Liu, Wei
    Zhang, Wenjie
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
  • [45] Multi-day Residential EV Charging Strategy Using Reinforcement Learning
    Goh, Dominic
    Sokolowski, Peter
    Jalili, Mahdi
    PROCEEDINGS OF 2021 IEEE 30TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2021
  • [46] Joint Reinforcement Learning Method Based on Roulette Algorithm and Simulated Annealing Strategy
    Hu Jin-bo
    Yang Rui-jun
    Cheng Yan
    2020 5TH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATICS AND BIOMEDICAL SCIENCES (ICIIBMS 2020), 2020 : 34 - 37
  • [47] Reinforcement learning path planning algorithm based on obstacle area expansion strategy
    Haiyang Chen
    Yebiao Ji
    Longhui Niu
    Intelligent Service Robotics, 2020, 13 : 289 - 297
  • [48] Deep Reinforcement Learning-Assisted Teaching Strategy for Industrial Robot Manipulator
    Simon, Janos
    Gogolak, Laszlo
    Sarosi, Jozsef
    APPLIED SCIENCES-BASEL, 2024, 14 (23):
  • [49] Reinforcement learning path planning algorithm based on obstacle area expansion strategy
    Chen, Haiyang
    Ji, Yebiao
    Niu, Longhui
    INTELLIGENT SERVICE ROBOTICS, 2020, 13 (02) : 289 - 297
  • [50] Jamming strategy learning based on positive reinforcement learning and orthogonal decomposition
    Zhuansun S.
    Yang J.
    Liu H.
    Huang K.
    Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2018, 40 (03) : 518 - 525