Value functions for depth-limited solving in zero-sum imperfect-information games
Cited by: 3
Authors:
Kovarik, Vojtech [1]
Seitz, Dominik [1]
Lisy, Viliam [1]
Rudolf, Jan [1]
Sun, Shuo [1]
Ha, Karel [1]
Affiliations:
[1] Czech Tech Univ, Artificial Intelligence Ctr, FEE, Prague, Czech Republic
Keywords:
Imperfect information game;
Multiagent reinforcement learning;
Extensive form game;
Partially observable stochastic game;
Depth limited game;
Depth limited solving;
Value function;
Counterfactual regret minimization;
DOI:
10.1016/j.artint.2022.103805
Chinese Library Classification (CLC) number:
TP18 [Artificial intelligence theory];
Discipline classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
We provide a formal definition of depth-limited games together with an accessible and rigorous explanation of the underlying concepts, both of which were previously missing in imperfect-information games. The definition works for an arbitrary (perfect recall) extensive-form game and is not tied to any specific game-solving algorithm. Moreover, this framework unifies and significantly extends three approaches to depth-limited solving that previously existed in extensive-form games and multiagent reinforcement learning but were not known to be compatible. A key ingredient of these depth-limited games is value functions. Focusing on two-player zero-sum imperfect-information games, we show how to obtain optimal value functions and prove that public information provides both necessary and sufficient context for computing them. We provide a domain-independent encoding of the domains that allows for approximating value functions even by simple feed-forward neural networks, which are then able to generalize to unseen parts of the game. We use the resulting value network to implement a depth-limited version of counterfactual regret minimization. In three distinct domains, we show that the algorithm's exploitability is roughly linearly dependent on the value network's quality and that it is not difficult to train a value network with which depth-limited CFR's performance is as good as that of CFR with access to the full game. (c) 2022 Published by Elsevier B.V.
Pages: 51