Hardware implementation of the upper confidence-bound algorithm for reinforcement learning

Cited by: 4
Authors
Radovic, Nevena [1]
Erceg, Milena [1]
Affiliations
[1] Univ Montenegro, Elect Engn Dept, Cetinjski Put Bb, Podgorica 81000, Montenegro
Keywords
FPGA; Hardware implementation; Machine learning; Multi-armed bandit problem; Upper confidence-bound algorithm; ARCHITECTURE;
DOI
10.1016/j.compeleceng.2021.107537
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The upper confidence-bound algorithm is a popular and useful approach in reinforcement learning, suitable for solving diverse modern-day problems. In this paper, we develop efficient, multiple-clock-cycle hardware for this algorithm to enable its practical application in real time. A real-life scenario belonging to the class of problems commonly known as multi-armed bandit problems is considered. The developed design is tested and verified on a field-programmable gate array (FPGA). The obtained results match the accuracy of those achieved in software simulation, which demonstrates the robustness of the developed solution. In terms of execution time, the proposed hardware implementation significantly outperforms the software simulation. Finally, the computational complexity of the implementation does not depend on the number of observed iterations, which supports the effective implementation of the developed design. All implementation details are provided.
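For orientation, the following is a minimal software sketch of the textbook UCB1 rule for a multi-armed bandit. It is not the hardware design described in the abstract; the function name `ucb1`, the exploration constant `c`, and the Bernoulli arms in the demo are assumptions made for illustration. The incremental mean update shows one way the per-round work can stay constant regardless of how many rounds have already been observed.

```python
import math
import random


def ucb1(reward_fns, rounds, c=2.0):
    """Textbook UCB1 (illustrative, not the paper's design).

    reward_fns: list of zero-argument callables, one per arm (hypothetical interface).
    Returns (pull counts per arm, estimated mean reward per arm).
    """
    n_arms = len(reward_fns)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            # Initialisation: pull every arm once so each count is non-zero.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(c * math.log(t) / counts[a]),
            )
        reward = reward_fns[arm]()
        counts[arm] += 1
        # Incremental mean: constant work per round, no reward history stored.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.2, 0.5, 0.8]  # assumed success probabilities of three arms
    arms = [lambda p=p: 1.0 if rng.random() < p else 0.0 for p in probs]
    counts, means = ucb1(arms, rounds=2000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(m, 3) for m in means])
```

Only two values per arm (a pull count and a running mean) are kept, so the state and the per-round arithmetic do not grow with the number of rounds.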
Pages: 9
Related papers
50 in total
  • [21] Whale Algorithm for Image Processing, A Hardware Implementation
    Zakerhaghighi, Mohammad Reza
    Naji, Hamid Reza
    2013 8TH IRANIAN CONFERENCE ON MACHINE VISION & IMAGE PROCESSING (MVIP 2013), 2013, : 355 - 359
  • [22] Hardware Implementation of Census Stereo Matching Algorithm
    Qiao, Shijie
    Yang, Jiawei
    Meng, Lei
    Yan, Shuo
    2019 IEEE INTERNATIONAL CONFERENCE ON ELECTRON DEVICES AND SOLID-STATE CIRCUITS (EDSSC), 2019,
  • [23] Decimal Square Root: Algorithm and Hardware Implementation
    Hosseiny, Adel
    Jaberipur, Ghassem
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2016, 35 (12) : 4195 - 4219
  • [24] Hardware implementation of the MD5 algorithm
    Pamula, D.
    Ziebinski, A.
    IFAC WORKSHOP ON PROGRAMMABLE DEVICES AND EMBEDDED SYSTEMS (PDES 2009), PROCEEDINGS, 2009, : 45 - 50
  • [25] Reconfigurable hardware implementation of K-nearest neighbor algorithm on FPGA
    Yacoub, Mohammed H.
    Ismail, Samar M.
    Said, Lobna A.
    Madian, Ahmed H.
    Radwan, Ahmed G.
    AEU-INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATIONS, 2024, 173
  • [26] Hardware Implementation of FAST Algorithm for Mobile Applications
    Soberl, Domen
    Zimic, Nikolaj
    Leonardis, Ales
    Krivic, Jaka
    Moskon, Miha
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2015, 79 (03) : 247 - 256
  • [27] FPGA based hardware implementation of Bat Algorithm
    Ben Ameur, Mohamed Sadok
    Sakly, Anis
    APPLIED SOFT COMPUTING, 2017, 58 : 378 - 387
  • [28] A Hardware Implementation of the PID Algorithm Using Floating-Point Arithmetic
    Kulisz, Jozef
    Jokiel, Filip
    ELECTRONICS, 2024, 13 (08)
  • [29] A Hardware-Oriented IME Algorithm for HEVC and Its Hardware Implementation
    Fan, Yibo
    Huang, Leilei
    Hao, Bei
    Zeng, Xiaoyang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2018, 28 (08) : 2048 - 2057
  • [30] Pairwise Regression with Upper Confidence Bound for Contextual Bandit with Multiple Actions
    Chang, Ya-Hsuan
    Lin, Hsuan-Tien
    2013 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2013, : 19 - 24