Monte Carlo tree search control scheme for multibody dynamics applications

Cited by: 0
Authors
Tang, Yixuan [1]
Orzechowski, Grzegorz [1]
Prokop, Ales [2]
Mikkola, Aki [1]
Affiliations
[1] LUT Univ, Dept Mech Engn, Lappeenranta 53850, Finland
[2] Brno Univ Technol, Fac Mech Engn, Technicka 2896-2, Brno 61669, Czech Republic
Keywords
Monte Carlo Tree Search; Multibody dynamics; Reward functions; Parametric analysis; Artificial intelligence control; Inverted pendulum; DOUBLE PENDULUM; SWING-UP; STRATEGY; GAME; GO
DOI
10.1007/s11071-024-09509-8
Chinese Library Classification (CLC) number
TH [Machinery and instrument industry]
Discipline classification code
0802
Abstract
There is considerable interest in applying reinforcement learning (RL) to improve machine control across multiple industries, with the automotive industry a prime example. Monte Carlo Tree Search (MCTS) has proven powerful in decision-making games, even without knowledge of the game rules. In this study, multibody system dynamics (MSD) control is first modeled as a Markov Decision Process and then solved with MCTS. Based on randomized search space exploration, the MCTS framework builds a selective search tree by repeatedly applying a Monte Carlo rollout at each child node. However, without a library of available choices, deciding among the many possibilities for agent parameters can be daunting. In addition, the large branching factor makes the MCTS search challenging. This challenge is typically overcome by appropriate parameter design, search guiding, action reduction, parallelization, and early termination. To address these difficulties, this study aims to provide insight into inverted pendulum control using both vanilla and modified MCTS agents. A series of reward functions is designed according to the control goal; each maps a specific distribution shape of the reward bonus and guides the MCTS-based controller to maintain the upright position. Numerical examples show that the reward-modified MCTS algorithms significantly improve control performance and robustness over the default constant reward used in the vanilla MCTS. The exponentially decaying reward functions perform better than the constant or polynomial reward functions. Moreover, the exploration vs. exploitation trade-off and the discount parameters are carefully tested. The study's results can guide users of RL-based MSD control.
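The three reward shapes compared in the abstract (constant, polynomial, exponentially decaying), together with the discount and the exploration vs. exploitation trade-off, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the threshold `THETA_MAX`, the parameters `order`, `decay`, `gamma`, and `c`, and the use of the standard UCB1 rule for node selection.

```python
import math

# Hypothetical failure threshold on the pendulum angle (rad); not from the paper.
THETA_MAX = math.pi / 2


def constant_reward(theta: float) -> float:
    """Constant bonus for every non-failed state (the vanilla choice)."""
    return 1.0 if abs(theta) < THETA_MAX else 0.0


def polynomial_reward(theta: float, order: int = 2) -> float:
    """Bonus falls polynomially from 1 at upright to 0 at the threshold."""
    x = min(abs(theta) / THETA_MAX, 1.0)
    return 1.0 - x ** order


def exponential_reward(theta: float, decay: float = 5.0) -> float:
    """Bonus decays exponentially with the deviation from upright."""
    if abs(theta) >= THETA_MAX:
        return 0.0
    return math.exp(-decay * abs(theta) / THETA_MAX)


def discounted_return(rewards, gamma: float = 0.95) -> float:
    """Sum of rollout rewards, with step k weighted by gamma**k."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def uct_score(total_value: float, visits: int, parent_visits: int,
              c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score: mean value plus an exploration bonus scaled by c."""
    if visits == 0:
        return math.inf  # unvisited children are tried first
    exploit = total_value / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


if __name__ == "__main__":
    for theta in (0.0, 0.2, 0.8):
        print(theta, constant_reward(theta),
              polynomial_reward(theta), exponential_reward(theta))
```

In this sketch, a larger `c` favors exploring rarely visited actions, while a smaller `gamma` makes the agent weigh near-term rewards more heavily. The exponential shape concentrates the bonus near the upright position, whereas the constant shape gives no gradient toward it.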
Pages: 8363-8391
Page count: 29
Related papers
50 in total
  • [1] Monte Carlo tree search control scheme for multibody dynamics applications
    Yixuan Tang
    Grzegorz Orzechowski
    Aleš Prokop
    Aki Mikkola
    Nonlinear Dynamics, 2024, 112 : 8363 - 8391
  • [2] Monte Carlo Tree Search: a review of recent modifications and applications
    Swiechowski, Maciej
    Godlewski, Konrad
    Sawicki, Bartosz
    Mandziuk, Jacek
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (03) : 2497 - 2562
  • [3] Nonasymptotic Analysis of Monte Carlo Tree Search
    Shah, Devavrat
    Xie, Qiaomin
    Xu, Zhi
    OPERATIONS RESEARCH, 2022, 70 (06) : 3234 - 3260
  • [4] Text Matching with Monte Carlo Tree Search
    He, Yixuan
    Tao, Shuchang
    Xu, Jun
    Guo, Jiafeng
    Lan, YanYan
    Cheng, Xueqi
    INFORMATION RETRIEVAL, CCIR 2018, 2018, 11168 : 41 - 52
  • [5] MONTE CARLO TREE SEARCH: A TUTORIAL
    Fu, Michael C.
    2018 WINTER SIMULATION CONFERENCE (WSC), 2018, : 222 - 236
  • [6] Monte Carlo Tree Search: a review of recent modifications and applications
    Maciej Świechowski
    Konrad Godlewski
    Bartosz Sawicki
    Jacek Mańdziuk
    Artificial Intelligence Review, 2023, 56 : 2497 - 2562
  • [7] Beyond games: a systematic review of neural Monte Carlo tree search applications
    Kemmerling, Marco
    Luetticke, Daniel
    Schmitt, Robert H.
    APPLIED INTELLIGENCE, 2024, 54 (01) : 1020 - 1046
  • [8] A TUTORIAL INTRODUCTION TO MONTE CARLO TREE SEARCH
    Fu, Michael C.
    2020 WINTER SIMULATION CONFERENCE (WSC), 2020, : 1178 - 1193
  • [9] Fittest survival: an enhancement mechanism for Monte Carlo tree search
    Zhang, Jiajia
    Sun, Xiaozhen
    Zhang, Dandan
    Wang, Xuan
    Qi, Shuhan
    Qian, Tao
    INTERNATIONAL JOURNAL OF BIO-INSPIRED COMPUTATION, 2021, 18 (02) : 122 - 130
  • [10] On Monte Carlo Tree Search and Reinforcement Learning
    Vodopivec, Tom
    Samothrakis, Spyridon
    Ster, Branko
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2017, 60 : 881 - 936