Deep reinforcement learning with reward shaping for tracking control and vibration suppression of flexible link manipulator

Cited by: 12

Authors:

Viswanadhapalli, Joshi Kumar [1]

Elumalai, Vinodh Kumar [1]

Shivram, S. [1]

Shah, Sweta [1]

Mahajan, Dhruv [1]

Affiliations:

[1] Vellore Inst Technol, Sch Elect Engn, Vellore 632014, Tamilnadu, India

Source:

APPLIED SOFT COMPUTING | 2024 / Vol. 152

Keywords:

Deep reinforcement learning; Deep deterministic policy gradient; Adaptive Kalman filter; Flexible link; Vibration control; IMPLEMENTATION; SYSTEM; IDENTIFICATION; FEEDBACK;

DOI:

10.1016/j.asoc.2023.110756

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper puts forward a novel deep reinforcement learning control framework based on the deep deterministic policy gradient (DRLC-DDPG) to address the reference tracking and vibration suppression problem of a rotary flexible link (RFL) manipulator. Specifically, the study addresses the continuous-action-space DRLC problem through the DDPG algorithm and presents a Lyapunov-function-based reward shaping approach for guaranteed deep reinforcement learning (DRL) convergence and faster training. The proposed approach formulates the hard and soft constraints of the flexible manipulator as a constrained Markov decision problem (MDP) and evaluates the performance of the DRLC-DDPG framework through hardware-in-the-loop (HIL) testing to realize precise servo tracking and vibration suppression of the flexible manipulator. To identify the dynamical model of the RFL, an empirical Auto-Regressive eXogenous (ARX) model is built using a closed-loop identification technique. Moreover, to extract the true states (servo angle and deflection angle) from the actual measurements, which are typically corrupted by sensor noise, an adaptive Kalman filter (AKF) is augmented with the DRLC scheme. Experimental results for several test cases show that the DRLC-DDPG scheme outperforms model predictive control (MPC) in both trajectory tracking and robustness against external disturbances and model uncertainty. © 2023 Elsevier B.V. All rights reserved.
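
The abstract does not give the reward function itself. As a rough illustration of what Lyapunov-based reward shaping can look like, the sketch below uses the standard potential-based form r' = r + γΦ(s') − Φ(s) with Φ = −V, where V is a quadratic Lyapunov candidate over the servo tracking error and the deflection angle. The weights, the discount factor, and the function names are all illustrative assumptions, not values or code from the paper.

```python
# Hypothetical sketch: potential-based reward shaping with a quadratic
# Lyapunov candidate V = q_theta * theta_err^2 + q_alpha * alpha^2.
# All weights and names are illustrative assumptions, not from the paper.

GAMMA = 0.99  # assumed discount factor


def lyapunov(theta_err, alpha, q_theta=1.0, q_alpha=5.0):
    """Quadratic Lyapunov candidate over tracking error and deflection."""
    return q_theta * theta_err ** 2 + q_alpha * alpha ** 2


def shaped_reward(base_reward, state, next_state, gamma=GAMMA):
    """Add the shaping term gamma * Phi(s') - Phi(s), with Phi = -V."""
    phi = -lyapunov(*state)
    phi_next = -lyapunov(*next_state)
    return base_reward + gamma * phi_next - phi


# state = (servo tracking error [rad], deflection angle [rad])
toward = shaped_reward(0.0, (0.5, 0.1), (0.3, 0.05))  # V decreases
away = shaped_reward(0.0, (0.3, 0.05), (0.5, 0.1))    # V increases
```

With these assumed weights, a transition that lowers V receives a positive shaping bonus and one that raises V is penalized, which is the intuition behind using a Lyapunov function to guide training.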


Pages: 17
