Monte Carlo tree search control scheme for multibody dynamics applications

Times Cited: 0
Authors
Tang, Yixuan [1 ]
Orzechowski, Grzegorz [1 ]
Prokop, Ales [2 ]
Mikkola, Aki [1 ]
Affiliations
[1] LUT Univ, Dept Mech Engn, Lappeenranta 53850, Finland
[2] Brno Univ Technol, Fac Mech Engn, Technicka 2896-2, Brno 61669, Czech Republic
Keywords
Monte Carlo Tree Search; Multibody dynamics; Reward functions; Parametric analysis; Artificial intelligence control; Inverted pendulum; DOUBLE PENDULUM; SWING-UP; STRATEGY; GAME; GO;
DOI
10.1007/s11071-024-09509-8
Chinese Library Classification (CLC): TH [Machinery and Instrument Industries]
Discipline Code: 0802
Abstract
There is considerable interest in applying reinforcement learning (RL) to improve machine control across multiple industries, with the automotive industry being a prime example. Monte Carlo Tree Search (MCTS) has proven powerful in decision-making games, even without prior knowledge of the rules. In this study, multibody system dynamics (MSD) control is first modeled as a Markov Decision Process and then solved with MCTS. Based on randomized exploration of the search space, the MCTS framework builds a selective search tree by repeatedly applying Monte Carlo rollouts at each child node. However, without a library of available choices, selecting agent parameters from the many possibilities can be daunting. In addition, the large branching factor makes the search itself challenging; this challenge is typically addressed through appropriate parameter design, search guidance, action reduction, parallelization, and early termination. To address these shortcomings, the overarching goal of this study is to provide the needed insight into inverted pendulum control via vanilla and modified MCTS agents. A series of reward functions is designed according to the control goal; each maps a specific distribution shape of the reward bonus and guides the MCTS-based controller to maintain the upright position. Numerical examples show that the reward-modified MCTS algorithms significantly improve control performance and robustness over the default constant reward that constitutes the vanilla MCTS. Exponentially decaying reward functions perform better than constant or polynomial ones. Moreover, the exploitation versus exploration trade-off and the discount parameter are carefully tested. The results can guide research by users of RL-based MSD control.
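For orientation, the sketch below illustrates, under stated assumptions, the kind of vanilla MCTS controller and reward shaping the abstract describes: a UCB1-based tree search selects torques for a simplified inverted pendulum, and three candidate reward shapes (constant, polynomial, exponentially decaying) map the angular deviation to a bonus. This is not the authors' implementation; the pendulum model, torque set, exploration constant, discount factor, and rollout horizon are illustrative assumptions, not the paper's settings.

```python
import math
import random

# Assumed, illustrative constants (not taken from the paper).
DT = 0.02                       # integration step [s]
GRAVITY, LENGTH = 9.81, 1.0     # gravity [m/s^2], pendulum length [m]
ACTIONS = (-2.0, 0.0, 2.0)      # discrete torque set [N*m]
C_UCT = 1.4                     # UCB1 exploration constant
GAMMA = 0.99                    # discount factor
HORIZON = 60                    # rollout depth


def step(state, torque):
    """Explicit-Euler update of (angle, angular velocity); angle = 0 is upright."""
    theta, omega = state
    omega = omega + (GRAVITY / LENGTH * math.sin(theta) + torque) * DT
    theta = theta + omega * DT
    return (theta, omega)


# Candidate reward shapes: each maps the angular deviation to a bonus in [0, 1].
def reward_constant(theta):
    return 1.0 if abs(theta) < math.pi / 2 else 0.0


def reward_polynomial(theta):
    return max(0.0, 1.0 - (abs(theta) / math.pi) ** 2)


def reward_exponential(theta):
    return math.exp(-4.0 * abs(theta))


class Node:
    """One tree node: state, visit count, accumulated value, and child nodes."""
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}          # action -> child Node
        self.visits, self.value = 0, 0.0

    def ucb1(self, child):
        """UCB1 score balancing exploitation and exploration."""
        if child.visits == 0:
            return float("inf")
        exploit = child.value / child.visits
        explore = C_UCT * math.sqrt(math.log(self.visits) / child.visits)
        return exploit + explore


def rollout(state, reward_fn):
    """Random playout from 'state'; returns the discounted sum of shaped rewards."""
    total, discount = 0.0, 1.0
    for _ in range(HORIZON):
        state = step(state, random.choice(ACTIONS))
        total += discount * reward_fn(state[0])
        discount *= GAMMA
    return total


def mcts_action(root_state, reward_fn, iterations=300):
    """Run MCTS from 'root_state' and return the most-visited root action."""
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend while the current node is fully expanded.
        while len(node.children) == len(ACTIONS):
            parent = node
            node = max(parent.children.values(), key=lambda c: parent.ucb1(c))
        # Expansion: add one untried action from this node.
        untried = [a for a in ACTIONS if a not in node.children]
        if untried:
            action = random.choice(untried)
            node.children[action] = Node(step(node.state, action), parent=node)
            node = node.children[action]
        # Simulation and backpropagation.
        value = rollout(node.state, reward_fn)
        while node is not None:
            node.visits += 1
            node.value += value
            node = node.parent
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]


if __name__ == "__main__":
    state = (0.3, 0.0)              # start slightly tilted from upright, at rest
    for _ in range(50):
        state = step(state, mcts_action(state, reward_exponential))
    print(f"final angle deviation: {state[0]:.3f} rad")
```

Swapping reward_exponential for reward_constant or reward_polynomial in the usage block reproduces, in spirit, the reward-shape comparison the abstract reports; the quantitative behaviour of course depends on the assumed model and parameters.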
Pages: 8363-8391
Number of pages: 29