Deep reinforcement learning with reward shaping for tracking control and vibration suppression of flexible link manipulator

Cited by: 12
Authors
Viswanadhapalli, Joshi Kumar [1 ]
Elumalai, Vinodh Kumar [1 ]
Shivram, S. [1 ]
Shah, Sweta [1 ]
Mahajan, Dhruv [1 ]
Affiliations
[1] Vellore Inst Technol, Sch Elect Engn, Vellore 632014, Tamilnadu, India
Keywords
Deep reinforcement learning; Deep deterministic policy gradient; Adaptive Kalman filter; Flexible link; Vibration control; Implementation; System; Identification; Feedback
DOI
10.1016/j.asoc.2023.110756
CLC number
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper puts forward a novel deep reinforcement learning control framework based on the deep deterministic policy gradient (DRLC-DDPG) to address the reference tracking and vibration suppression problem of a rotary flexible link (RFL) manipulator. Specifically, this study addresses the continuous-action-space DRLC problem through the DDPG algorithm and presents a Lyapunov-function-based reward shaping approach that guarantees deep reinforcement learning (DRL) convergence and speeds up training. The proposed approach synthesizes the hard and soft constraints of the flexible manipulator as a constrained Markov decision process (MDP) and evaluates the performance of the DRLC-DDPG framework through hardware-in-the-loop (HIL) testing to realize precise servo tracking and suppressed vibration of the flexible manipulator. To identify the dynamical model of the RFL, an empirical auto-regressive exogenous (ARX) model is built using a closed-loop identification technique. Moreover, to extract the true states (servo angle and deflection angle) from the actual measurements, which are typically corrupted by sensor noise, an adaptive Kalman filter (AKF) is augmented with the DRLC scheme. Experimental results of the DRLC-DDPG scheme, compared with those of model predictive control (MPC) for several test cases, reveal that the proposed scheme is superior to MPC in terms of both trajectory tracking and robustness against external disturbances and model uncertainty. (c) 2023 Elsevier B.V. All rights reserved.
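The Lyapunov-function-based reward shaping described in the abstract can be sketched as potential-based shaping with a negative Lyapunov candidate as the potential. The sketch below is illustrative only: the quadratic form, the weight matrix `P`, the discount factor, and the two-element state layout (servo tracking error, link deflection) are assumptions, not values taken from the paper.

```python
# Minimal sketch of Lyapunov-function-based potential reward shaping.
# Assumptions (not from the paper): quadratic Lyapunov candidate
# V(s) = s^T P s over [servo tracking error, link deflection], with
# illustrative weights P and discount factor gamma.
import numpy as np

GAMMA = 0.99                      # discount factor (assumed)
P = np.diag([10.0, 1.0])          # Lyapunov weights (assumed)

def lyapunov(state: np.ndarray) -> float:
    """Quadratic Lyapunov candidate V(s) = s^T P s; V >= 0 and V(0) = 0."""
    return float(state @ P @ state)

def shaped_reward(base_reward: float, s: np.ndarray, s_next: np.ndarray) -> float:
    """Potential-based shaping r' = r + gamma*Phi(s') - Phi(s) with Phi = -V.

    Transitions that decrease V (i.e. drive tracking error and deflection
    toward zero) receive a bonus; potential-based shaping leaves the optimal
    policy unchanged while densifying the learning signal.
    """
    phi, phi_next = -lyapunov(s), -lyapunov(s_next)
    return base_reward + GAMMA * phi_next - phi

# A transition that shrinks both the tracking error and the deflection
# earns a positive shaping bonus even when the base reward is zero:
s = np.array([0.5, 0.2])          # [rad] servo error, deflection (example values)
s_next = np.array([0.1, 0.05])
bonus = shaped_reward(0.0, s, s_next)   # positive for this improving transition
```

In a DDPG training loop, `shaped_reward` would replace the raw environment reward in the stored transitions; the actor and critic updates themselves are unchanged.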
Pages: 17