SOFT ACTOR-CRITIC ALGORITHM WITH ADAPTIVE NORMALIZATION

Times Cited: 0
Authors
Gao, Xiaonan [1 ]
Wu, Ziyi [1 ]
Zhu, Xianchao [1 ]
Cai, Lei [2 ]
Affiliations
[1] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[2] Henan Inst Sci & Technol, Sch Artificial Intelligence, Xinxiang 453003, Peoples R China
Source
JOURNAL OF NONLINEAR FUNCTIONAL ANALYSIS | 2025 / Vol. 2025
Funding
National Natural Science Foundation of China;
Keywords
Adaptive normalization; Deep reinforcement learning; Reward mechanism; Soft actor-critic algorithm; GAME; GO;
DOI
10.23952/jnfa.2025.6
CLC Number
O29 [Applied Mathematics];
Discipline Code
070104;
Abstract
In recent years, breakthroughs have been made in the field of deep reinforcement learning, but real-world applications have been seriously hampered by the instability of the algorithms and the difficulty of guaranteeing convergence. Although the Soft Actor-Critic (SAC) algorithm, a representative reinforcement learning method, enhances robustness and the agent's exploration ability by introducing the concept of maximum entropy, it still suffers from instability during training. To address this problem, this paper proposes an Adaptive Normalization-based SAC (AN-SAC) algorithm. By introducing an adaptive normalized reward mechanism into the SAC algorithm, our method dynamically adjusts the reward's normalization parameters during training so that the reward has zero mean and unit variance. The algorithm thereby adapts better to the reward distribution, improving its performance and stability. Experimental results demonstrate that the AN-SAC algorithm achieves significantly better performance and stability than the SAC algorithm.
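Note (illustrative sketch): the abstract describes, but does not detail, the adaptive normalized reward mechanism. The Python sketch below shows one plausible way to maintain running reward statistics online and standardize each reward to roughly zero mean and unit variance before it reaches the SAC update. The class name AdaptiveRewardNormalizer, the Welford-style update, and the epsilon term are assumptions of this illustration, not details taken from the paper.

```python
import numpy as np


class AdaptiveRewardNormalizer:
    """Running estimate of the reward mean and variance (Welford's algorithm).

    Rewards are standardized to roughly zero mean and unit variance, with the
    statistics updated online as new rewards arrive during training.
    """

    def __init__(self, epsilon: float = 1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0           # running sum of squared deviations from the mean
        self.epsilon = epsilon  # avoids division by zero early in training

    def update(self, reward: float) -> None:
        # One-pass (Welford) update of the running mean and variance.
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    @property
    def std(self) -> float:
        var = self.m2 / self.count if self.count > 1 else 1.0
        return float(np.sqrt(var + self.epsilon))

    def normalize(self, reward: float) -> float:
        # Standardize the raw reward with the current running statistics.
        return (reward - self.mean) / self.std


if __name__ == "__main__":
    # Toy usage: rewards drawn from a drifting distribution are mapped to
    # approximately zero mean and unit variance before they would be stored
    # in the replay buffer of an off-policy learner such as SAC.
    rng = np.random.default_rng(0)
    normalizer = AdaptiveRewardNormalizer()
    for step in range(1000):
        raw_reward = rng.normal(loc=5.0 + 0.01 * step, scale=3.0)
        normalizer.update(raw_reward)
        scaled_reward = normalizer.normalize(raw_reward)
    print(f"running mean={normalizer.mean:.2f}, std={normalizer.std:.2f}")
```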
Pages: 10
Related Papers
50 records in total
  • [31] Soft actor-critic with automatically adjusted entropy for autonomous exploring in unknown environments
    Jebrane, Walid
    Akchioui, Nabil El
    INTERNATIONAL JOURNAL OF VEHICLE PERFORMANCE, 2025, 11 (01) : 79 - 104
  • [32] Soft Actor-Critic optimization for efficient NOMA uplink in intelligent vehicular networks
    Pi, Peng
    Ren, Guangyuan
    PHYSICAL COMMUNICATION, 2025, 68
  • [33] When Visible Light Communication Meets RIS: A Soft Actor-Critic Approach
    Zhang, Long
    Jia, Xingliang
    Tian, Ni
    Hong, Choong Seon
    Han, Zhu
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2024, 13 (04) : 1208 - 1212
  • [34] Real-time route planning of unmanned aerial vehicles based on improved soft actor-critic algorithm
    Zhou, Yuxiang
    Shu, Jiansheng
    Zheng, Xiaolong
    Hao, Hui
    Song, Huan
    FRONTIERS IN NEUROROBOTICS, 2022, 16
  • [35] A UAV-Centric Improved Soft Actor-Critic Algorithm for QoE-Focused Aerial Video Streaming
    Yaqoob, Abid
    Yuan, Zhenhui
    Muntean, Gabriel-Miro
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (09) : 13498 - 13512
  • [36] Quadruped robot locomotion via soft actor-critic with multi-head critic and dynamic policy gradient
    Fan, Yanan
    Pei, Zhongcai
    Shi, Hongbing
    Li, Meng
    Guo, Tianyuan
    Tang, Zhiyong
    APPLIED INTELLIGENCE, 2025, 55 (10)
  • [37] Self-learning adaptive power management scheme for energy-efficient IoT-MEC systems using soft actor-critic algorithm
    Rahmani, Amir Masoud
    Haider, Amir
    Moghaddasi, Komeil
    Gharehchopogh, Farhad Soleimanian
    Aurangzeb, Khursheed
    Liu, Zhe
    Hosseinzadeh, Mehdi
    INTERNET OF THINGS, 2025, 31
  • [38] Actor-Critic Deep Reinforcement Learning for Solving Job Shop Scheduling Problems
    Liu, Chien-Liang
    Chang, Chuan-Chin
    Tseng, Chun-Jan
    IEEE ACCESS, 2020, 8 : 71752 - 71762
  • [39] Actor-Critic based Improper Reinforcement Learning
    Zaki, Mohammadi
    Mohan, Avinash
    Gopalan, Aditya
    Mannor, Shie
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [40] The Proposal of Double Agent Architecture using Actor-critic Algorithm for Penetration Testing
    Nguyen, Hoang Viet
    Teerakanok, Songpon
    Inomata, Atsuo
    Uehara, Tetsutaro
    ICISSP: PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS SECURITY AND PRIVACY, 2021, : 440 - 449