A new hybrid learning control system for robots based on spiking neural networks

Cited by: 1
Authors
Azimirad, Vahid [1]
Khodkam, S. Yaser [2]
Bolouri, Amir [3]
Affiliations
[1] Univ Kent, Sch Engn, Canterbury, England
[2] Univ Tabriz, Fac Mech Engn, Tabriz, Iran
[3] Univ West England, Fac Engn, Bristol, England
Keywords
Spiking neural networks; Reinforcement learning; Robot controller; Dopamine-modulated spike-timing-dependent plasticity; Fractional Order PID (FOPID); Feedback linearization; MODEL
DOI
10.1016/j.neunet.2024.106656
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new hybrid learning and control method in which controllers tune their own parameters through reinforcement learning. In the proposed method, nonlinear controllers are treated as multi-input multi-output functions, and these functions are then replaced with spiking neural networks (SNNs) trained by reinforcement learning algorithms. Dopamine-modulated spike-timing-dependent plasticity (STDP) is used for reinforcement learning and for adjusting the synaptic weights between the input and output neuronal groups (for parameter adjustment). Details of the method are presented, and case studies are carried out on nonlinear controllers such as Fractional Order PID (FOPID) and Feedback Linearization. The structure and the dynamic equations for learning are presented, the proposed algorithm is tested on robots, and the results are compared with other works. Moreover, to demonstrate the effectiveness of SNNFOPID, it was tested on a variety of systems including a two-wheel mobile robot, a double inverted pendulum, and a four-link manipulator robot. The results showed low errors of 0.01 m, 0.03 rad, and 0.03 rad for each system, respectively. The method is also tested on another controller, Feedback Linearization, for which it provides acceptable results. Results show that the new method performs better in terms of Integral Absolute Error (IAE) and is highly useful in hardware implementation due to its low energy consumption, high speed, and accuracy. The time needed to achieve full and stable proficiency in the control of various robotic systems using SNNFOPID and SNNFL on an Asus Core i5 system within Simulink's Simscape environment is as follows:
- Two-link robot manipulator with SNNFOPID: 19.85656 hours
- Two-link robot manipulator with SNNFL: 0.45828 hours
- Double inverted pendulum with SNNFOPID: 3.455 hours
- Mobile robot with SNNFOPID: 3.71948 hours
- Four-link robot manipulator with SNNFOPID: 16.6789 hours
This method can be generalized to other controllers and systems like robots.
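The abstract compares controllers by Integral Absolute Error (IAE), that is, the time integral of |r(t) - y(t)| between a reference r and a measured output y. The following is a minimal sketch of how such a metric could be computed from sampled trajectories using the trapezoidal rule. The function name, argument names, and sample values are illustrative assumptions and are not taken from the paper, which does not state its discretization in this record.

```python
def integral_absolute_error(times, reference, measured):
    """Approximate IAE = integral of |reference - measured| dt.

    Uses the trapezoidal rule on sampled data. All names here are
    hypothetical; the paper's own implementation is not shown.
    """
    if not (len(times) == len(reference) == len(measured)):
        raise ValueError("all sequences must have the same length")
    iae = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        e_prev = abs(reference[k - 1] - measured[k - 1])
        e_curr = abs(reference[k] - measured[k])
        # Area of the trapezoid spanned by two consecutive error samples.
        iae += 0.5 * (e_prev + e_curr) * dt
    return iae


# Illustrative data only: a unit step reference tracked after one second.
example_iae = integral_absolute_error(
    times=[0.0, 1.0, 2.0],
    reference=[1.0, 1.0, 1.0],
    measured=[0.0, 1.0, 1.0],
)
```

A lower value means the output stayed closer to the reference over the whole run, which is the sense in which the abstract reports better IAE performance.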
Pages: 32
Related papers
52 in total
  • [1] Trajectory tracking of differential drive mobile robots using fractional-order proportional-integral-derivative controller design tuned by an enhanced fruit fly optimization
    Abed, Azher M.
    Rashid, Zryan Najat
    Abedi, Firas
    Zeebaree, Subhi R. M.
    Sahib, Mouayad A.
    Mohamad Jawad, Anwar Ja'afar
    Redha Ibraheem, Ghusn Abdul
    Maher, Rami A.
    Abdulkareem, Ahmed Ibraheem
    Ibraheem, Ibraheem Kasim
    Azar, Ahmad Taher
    Al-khaykan, Ameer
    [J]. MEASUREMENT & CONTROL, 2022, 55 (3-4) : 209 - 226
  • [2] [Anonymous], 2010, 4 IFAC WORKSH FRACT
  • [3] Vision-based Learning: A Novel Machine Learning Method based on Convolutional Neural Networks and Spiking Neural Networks
    Azimirad, Vahid
    Sotubadi, Saleh Valizadeh
    Nasirlou, Ali
    [J]. 2021 9TH RSI INTERNATIONAL CONFERENCE ON ROBOTICS AND MECHATRONICS (ICROM), 2021, : 192 - 197
  • [4] A consecutive hybrid spiking-convolutional (CHSC) neural controller for sequential decision making in robots
    Azimirad, Vahid
    Ramezanlou, Mohammad Tayefe
    Sotubadi, Saleh Valizadeh
    Janabi-Sharifi, Farrokh
    [J]. NEUROCOMPUTING, 2022, 490 : 319 - 336
  • [5] Experimental Study of Reinforcement Learning in Mobile Robots Through Spiking Architecture of Thalamo-Cortico-Thalamic Circuitry of Mammalian Brain
    Azimirad, Vahid
    Sani, Mohammad Fattahi
    [J]. ROBOTICA, 2020, 38 (09) : 1558 - 1575
  • [6] Cao JY, 2006, 2006 1ST IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, VOLS 1-3, P69
  • [7] Cao JY, 2005, PROCEEDINGS OF 2005 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-9, P5686
  • [8] Fractional Order Modeling of a PHWR Under Step-Back Condition and Control of Its Global Power With a Robust PIλDμ Controller
    Das, Saptarshi
    Das, Shantanu
    Gupta, Amitava
    [J]. IEEE TRANSACTIONS ON NUCLEAR SCIENCE, 2011, 58 (05) : 2431 - 2441
  • [9] On the selection of tuning methodology of FOPID controllers for the control of higher order processes
    Das, Saptarshi
    Saha, Suman
    Das, Shantanu
    Gupta, Amitava
    [J]. ISA TRANSACTIONS, 2011, 50 (03) : 376 - 388
  • [10] Fractional order controller robust to time delay variations for water distribution in an irrigation main canal pool
    Feliu-Batlle, V.
    Rivas-Perez, R.
    Castillo-Garcia, F. J.
    [J]. COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2009, 69 (02) : 185 - 197