Improving Q-learning by using the agent's action history

Cited by: 0
Authors
Saito M. [1 ]
Sekozawa T. [2 ]
Affiliations
[1] Graduate School of Engineering, Kanagawa University, 3-27-1, Rokkakubashi, Kanagawa-ku, Yokohama, Kanagawa
[2] Dept. Information Systems Creation, Faculty of Engineering, Kanagawa University, 3-27-1, Rokkakubashi, Kanagawa-ku, Yokohama, Kanagawa
Keywords
Action history; Action selection; Machine learning; Q-learning; Reinforcement learning; Tabu search
DOI
10.1541/ieejeiss.136.1209
Abstract
Q-learning learns an optimal policy by updating a state-action value function (Q-value) through trial-and-error search so as to maximize the expected reward. However, a major issue is its slow learning speed. We therefore add a technique in which the agent memorizes environmental information and uses it to update the Q-value across many states. Updating the Q-value in a larger number of states gives the agent more information and reduces learning time. Furthermore, by incorporating the stored environmental information into the action selection method so that failure actions, such as those that cause learning to stagnate, are avoided, the learning speed in the initial stage of learning is improved. In addition, we design a new action-area value function so that many more states are explored from the initial stage of learning. Finally, numerical experiments on a maze problem demonstrate the usefulness of the proposed method. © 2016 The Institute of Electrical Engineers of Japan.
Pages: 1209 - 1217
Number of pages: 8
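The abstract describes augmenting Q-learning with memorized environmental information and a tabu-search-style action selection that avoids previously failed actions. The sketch below illustrates only that selection idea under stated assumptions; the paper's multi-state Q-value update and its action-area value function are not reproduced, and the class name `TabuQLearner` and parameters such as `tabu_len` are illustrative, not the authors' implementation.

```python
import random
from collections import defaultdict, deque

class TabuQLearner:
    """Minimal sketch: epsilon-greedy Q-learning with a tabu-style action
    history that masks recently failed (state, action) pairs during selection.
    Names and details are illustrative assumptions, not the paper's method."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, tabu_len=50):
        self.q = defaultdict(float)          # Q(s, a), defaults to 0.0
        self.actions = actions               # list of discrete actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.tabu = deque(maxlen=tabu_len)   # recent (state, action) failures

    def select_action(self, state):
        # Exclude actions recorded as failures in this state, when possible.
        allowed = [a for a in self.actions if (state, a) not in self.tabu]
        if not allowed:
            allowed = self.actions
        if random.random() < self.epsilon:
            return random.choice(allowed)
        return max(allowed, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, failed=False):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
        # Remember failed actions (e.g., hitting a wall in a maze) so that
        # action selection avoids repeating them in the same state.
        if failed:
            self.tabu.append((state, action))
```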
Related papers
50 records in total
  • [41] Double Q-learning Agent for Othello Board Game
    Somasundaram, Thamarai Selvi
    Panneerselvam, Karthikeyan
    Bhuthapuri, Tarun
    Mahadevan, Harini
    Jose, Ashik
    2018 10TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (ICOAC), 2018, : 216 - 223
  • [42] Learning to Play Pac-Xon with Q-Learning and Two Double Q-Learning Variants
    Schilperoort, Jits
    Mak, Ivar
    Drugan, Madalina M.
    Wiering, Marco A.
    2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2018, : 1151 - 1158
  • [43] Double action Q-learning for obstacle avoidance in a dynamically changing environment
    Ngai, DCK
    Yung, NHC
    2005 IEEE Intelligent Vehicles Symposium Proceedings, 2005, : 211 - 216
  • [44] Oil Production Optimization Using Q-Learning Approach
    Zahedi-Seresht, Mazyar
    Sadeghi Bigham, Bahram
    Khosravi, Shahrzad
    Nikpour, Hoda
    PROCESSES, 2024, 12 (01)
  • [45] Model based path planning using Q-Learning
    Sharma, Avinash
    Gupta, Kanika
    Kumar, Anirudha
    Sharma, Aishwarya
    Kumar, Rajesh
    2017 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2017, : 837 - 842
  • [46] BEAM MANAGEMENT SOLUTION USING Q-LEARNING FRAMEWORK
    Araujo, Daniel C.
    de Almeida, Andre L. F.
    2019 IEEE 8TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP 2019), 2019, : 594 - 598
  • [47] Solving Twisty Puzzles Using Parallel Q-learning
    Hukmani, Kavish
    Kolekar, Sucheta
    Vobugari, Sreekumar
    ENGINEERING LETTERS, 2021, 29 (04) : 1535 - 1543
  • [48] Feature Extraction in Q-Learning using Neural Networks
    Zhu, Henghui
    Paschalidis, Ioannis Ch.
    Hasselmo, Michael E.
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,
  • [50] Fuzzy Q-learning in continuous state and action space
    Xu M.-L.
    Xu W.-B.
    Journal of China Universities of Posts and Telecommunications, 2010, 17 (04): 100 - 109