SOFT ACTOR-CRITIC ALGORITHM WITH ADAPTIVE NORMALIZATION

Times Cited: 0
Authors
Gao, Xiaonan [1 ]
Wu, Ziyi [1 ]
Zhu, Xianchao [1 ]
Cai, Lei [2 ]
Affiliations
[1] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[2] Henan Inst Sci & Technol, Sch Artificial Intelligence, Xinxiang 453003, Peoples R China
Source
JOURNAL OF NONLINEAR FUNCTIONAL ANALYSIS | 2025 / Vol. 2025
Funding
National Natural Science Foundation of China;
Keywords
Adaptive normalization; Deep reinforcement learning; Reward mechanism; Soft actor-critic algorithm; GAME; GO;
DOI
10.23952/jnfa.2025.6
CLC Number
O29 [Applied Mathematics];
Discipline Code
070104;
Abstract
In recent years, breakthroughs have been made in the field of deep reinforcement learning, but real-world applications have been seriously hampered by the instability of the algorithms and the difficulty of guaranteeing convergence. Although the Soft Actor-Critic (SAC) algorithm, a representative reinforcement learning method, enhances robustness and the agent's exploration ability by introducing the concept of maximum entropy, it still suffers from instability during training. To address this problem, this paper proposes an Adaptive Normalization-based SAC (AN-SAC) algorithm. By introducing an adaptive normalized reward mechanism into the SAC algorithm, our method dynamically adjusts the reward's normalization parameters during training so that the reward has zero mean and unit variance. The algorithm thereby adapts better to the reward distribution, improving its performance and stability. Experimental results demonstrate that the AN-SAC algorithm achieves significantly better performance and stability than the SAC algorithm.
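Note (illustrative sketch): the abstract describes, but does not detail, the adaptive normalized reward mechanism. The Python sketch below shows one plausible way to maintain running reward statistics online and standardize each reward to roughly zero mean and unit variance before it reaches the SAC update. The class name AdaptiveRewardNormalizer, the Welford-style update, and the epsilon term are assumptions of this illustration, not details taken from the paper.

```python
import numpy as np


class AdaptiveRewardNormalizer:
    """Running estimate of the reward mean and variance (Welford's algorithm).

    Rewards are standardized to roughly zero mean and unit variance, with the
    statistics updated online as new rewards arrive during training.
    """

    def __init__(self, epsilon: float = 1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0           # running sum of squared deviations from the mean
        self.epsilon = epsilon  # avoids division by zero early in training

    def update(self, reward: float) -> None:
        # One-pass (Welford) update of the running mean and variance.
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    @property
    def std(self) -> float:
        var = self.m2 / self.count if self.count > 1 else 1.0
        return float(np.sqrt(var + self.epsilon))

    def normalize(self, reward: float) -> float:
        # Standardize the raw reward with the current running statistics.
        return (reward - self.mean) / self.std


if __name__ == "__main__":
    # Toy usage: rewards drawn from a drifting distribution are mapped to
    # approximately zero mean and unit variance before they would be stored
    # in the replay buffer of an off-policy learner such as SAC.
    rng = np.random.default_rng(0)
    normalizer = AdaptiveRewardNormalizer()
    for step in range(1000):
        raw_reward = rng.normal(loc=5.0 + 0.01 * step, scale=3.0)
        normalizer.update(raw_reward)
        scaled_reward = normalizer.normalize(raw_reward)
    print(f"running mean={normalizer.mean:.2f}, std={normalizer.std:.2f}")
```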
Pages: 10
Related Papers
50 records in total
  • [31] Soft actor-critic with automatically adjusted entropy for autonomous exploring in unknown environments
    Jebrane, Walid
    Akchioui, Nabil El
    INTERNATIONAL JOURNAL OF VEHICLE PERFORMANCE, 2025, 11 (01) : 79 - 104
  • [32] Soft Actor-Critic optimization for efficient NOMA uplink in intelligent vehicular networks
    Pi, Peng
    Ren, Guangyuan
    PHYSICAL COMMUNICATION, 2025, 68
  • [33] When Visible Light Communication Meets RIS: A Soft Actor-Critic Approach
    Zhang, Long
    Jia, Xingliang
    Tian, Ni
    Hong, Choong Seon
    Han, Zhu
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2024, 13 (04) : 1208 - 1212
  • [34] Real-time route planning of unmanned aerial vehicles based on improved soft actor-critic algorithm
    Zhou, Yuxiang
    Shu, Jiansheng
    Zheng, Xiaolong
    Hao, Hui
    Song, Huan
    FRONTIERS IN NEUROROBOTICS, 2022, 16
  • [35] A UAV-Centric Improved Soft Actor-Critic Algorithm for QoE-Focused Aerial Video Streaming
    Yaqoob, Abid
    Yuan, Zhenhui
    Muntean, Gabriel-Miro
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (09) : 13498 - 13512
  • [36] Quadruped robot locomotion via soft actor-critic with multi-head critic and dynamic policy gradient
    Fan, Yanan
    Pei, Zhongcai
    Shi, Hongbing
    Li, Meng
    Guo, Tianyuan
    Tang, Zhiyong
    APPLIED INTELLIGENCE, 2025, 55 (10)
  • [37] Self-learning adaptive power management scheme for energy-efficient IoT-MEC systems using soft actor-critic algorithm
    Rahmani, Amir Masoud
    Haider, Amir
    Moghaddasi, Komeil
    Gharehchopogh, Farhad Soleimanian
    Aurangzeb, Khursheed
    Liu, Zhe
    Hosseinzadeh, Mehdi
    INTERNET OF THINGS, 2025, 31
  • [38] Actor-Critic Deep Reinforcement Learning for Solving Job Shop Scheduling Problems
    Liu, Chien-Liang
    Chang, Chuan-Chin
    Tseng, Chun-Jan
    IEEE ACCESS, 2020, 8 : 71752 - 71762
  • [39] Actor-Critic based Improper Reinforcement Learning
    Zaki, Mohammadi
    Mohan, Avinash
    Gopalan, Aditya
    Mannor, Shie
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [40] The Proposal of Double Agent Architecture using Actor-critic Algorithm for Penetration Testing
    Nguyen, Hoang Viet
    Teerakanok, Songpon
    Inomata, Atsuo
    Uehara, Tetsutaro
    ICISSP: PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS SECURITY AND PRIVACY, 2021, : 440 - 449