Bioinspired actor-critic algorithm for reinforcement learning interpretation with Levy-Brown hybrid exploration strategy

Times cited: 3
Authors
Wang, Xiao [1 ,2 ]
Li, Dazi [1 ,3 ]
Affiliations
[1] Beijing Univ Chem Technol, Beijing 100029, Peoples R China
[2] Beijing Univ Chem Technol, Lecturer Coll Informat Sci & Technol, Beijing, Peoples R China
[3] Beijing Univ Chem Technol, Coll Informat Sci & Technol, Beijing 100024, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Interpretable reinforcement learning; Levy motion; Pareto distribution; Actor-critic; AVOIDANCE; NETWORKS; GAME;
DOI
10.1016/j.neucom.2024.127291
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The interpretability of reinforcement learning algorithms is currently a challenge, and this lack of interpretability limits the use of reinforcement learning with agents in the physical world. To improve interpretability, this study proposes a Levy-Brown hybrid strategy that modifies the exploration mechanism of the traditional Actor-Critic algorithm. The proposed strategy is bioinspired by Brownian motion and Levy motion in nature; it can therefore explain the data-acquisition process during learning from biological principles. The main idea of the new strategy is to map the Gaussian exploration strategy to biological Brownian motion and to introduce a biological Levy strategy that improves exploration efficiency. Combining the two strategies effectively exploits the Levy strategy's faster exploration and the Brownian strategy's greater stability. Experiments demonstrate the advantages of the proposed Levy-Brown hybrid strategy, which effectively combines the strengths and mitigates the weaknesses of both component strategies.
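The abstract's core idea, mixing Gaussian (Brownian) exploration noise with occasional heavy-tailed Levy-style jumps, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function name, the mixing probability `levy_prob`, the Pareto-based heavy-tailed step, and all default parameters are assumptions for illustration.

```python
import numpy as np

def levy_brown_action(mean_action, sigma=0.1, levy_prob=0.2,
                      pareto_alpha=1.5, rng=None):
    """Hypothetical sketch of a Levy-Brown hybrid exploration step.

    With probability `levy_prob` the exploration noise is drawn from a
    heavy-tailed Pareto-based distribution, producing occasional long
    jumps (Levy-like exploration); otherwise Gaussian noise is used,
    mirroring Brownian motion for stable local exploration. Names and
    defaults are illustrative, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() < levy_prob:
        # Heavy-tailed step: Pareto-distributed magnitude, random sign.
        magnitude = sigma * (rng.pareto(pareto_alpha) + 1.0)
        noise = magnitude * rng.choice([-1.0, 1.0],
                                       size=np.shape(mean_action))
    else:
        # Brownian (Gaussian) step for fine-grained local search.
        noise = rng.normal(0.0, sigma, size=np.shape(mean_action))
    return mean_action + noise
```

Under this sketch, the Gaussian branch keeps exploration concentrated near the actor's mean action, while the rarer Pareto branch provides the long-range jumps that speed up exploration, matching the division of labor the abstract attributes to the Brown and Levy components.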
Pages: 16