Design and implementation of a soft Actor-Critic controller for a robotic arm

Citations: 0
Authors
Kuo, Ping-Huan [1 ,2 ]
Huang, Chen-Ting [1 ]
Chang, Chen-Wen [3 ]
Feng, Po-Hsun [3 ]
Lin, Yu-Sian [1 ]
Affiliations
[1] Natl Chung Cheng Univ, Dept Mech Engn, Chiayi 62102, Taiwan
[2] Natl Chung Cheng Univ, Adv Inst Mfg High tech Innovat AIM HI, Chiayi 62102, Taiwan
[3] Natl Cheng Kung Univ, Dept Elect Engn, Tainan 701401, Taiwan
Keywords
Robotic arm controller; Reinforcement learning; Obstacle avoidance; Six-degree-of-freedom robotic arm; Soft actor-critic algorithm; REINFORCEMENT; SYSTEMS;
DOI
10.1016/j.engappai.2025.110589
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
In this study, reinforcement learning (RL) was employed to train and control a six-degree-of-freedom robotic arm in a point tracking task. The movements of such an arm are not intuitive because they are determined by the rotations of its six joints and the link lengths between them. RL therefore allows the arm to learn from its mistakes and autonomously explore how joint rotations affect its end-effector position. This approach is widely used in complex learning tasks, such as those in gaming, robotics, and automated driving. The proposed learning process was verified in a simulator, in which point tracking and obstacle avoidance tasks were performed using RL algorithms. The performance of the soft actor-critic (SAC), proximal policy optimization, advantage actor-critic, and trust region policy optimization algorithms was compared in the point tracking task, and the results indicated that the SAC algorithm outperformed the other algorithms. The SAC algorithm was therefore combined with a carefully designed reward function to train a model that enables the robotic arm to avoid obstacles in an object picking task. The trained model was verified to let the arm learn from failed experiences, navigate quickly to the target position, and support rapid deployment in future applications.
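The abstract describes combining SAC with a reward function that rewards reaching the target while penalizing contact with obstacles. The paper's actual reward formulation is not reproduced in this record; the sketch below is a hypothetical minimal example of such a shaped reward, with the penalty weight and safety radius chosen for illustration only.

```python
import numpy as np

def shaped_reward(end_effector, target, obstacle, safety_radius=0.1, penalty=10.0):
    """Hypothetical shaped reward for point tracking with obstacle avoidance.

    Returns the negative Euclidean distance from the end effector to the
    target, minus a fixed penalty whenever the end effector enters the
    obstacle's safety radius. All weights are illustrative assumptions,
    not the authors' design.
    """
    dist_to_target = np.linalg.norm(end_effector - target)
    dist_to_obstacle = np.linalg.norm(end_effector - obstacle)
    collision_penalty = penalty if dist_to_obstacle < safety_radius else 0.0
    return -dist_to_target - collision_penalty

# Example: an obstacle far from the end effector adds no penalty,
# while one inside the safety radius does.
r_clear = shaped_reward(np.zeros(3), np.array([1.0, 0.0, 0.0]),
                        np.array([5.0, 5.0, 5.0]))
r_blocked = shaped_reward(np.zeros(3), np.array([1.0, 0.0, 0.0]),
                          np.array([0.05, 0.0, 0.0]))
```

A reward of this shape gives the policy a dense gradient toward the target at every step, which is what lets an off-policy method such as SAC learn from failed episodes stored in its replay buffer.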
Pages: 23