Bridge Bidding via Deep Reinforcement Learning and Belief Monte Carlo Search

Cited by: 2
Authors
Qiu, Zizhang [1]
Wang, Shouguang [1]
You, Dan [1]
Zhou, MengChu [1]
Affiliations
[1] Zhejiang Gongshang Univ, Sch Informat & Elect Engn, Hangzhou 310018, Peoples R China
Keywords
Bridges; Monte Carlo methods; Supervised learning; Interference; Games; Deep reinforcement learning; Software; Contract Bridge; reinforcement learning; search; GO; ALGORITHM; GAME;
DOI
10.1109/JAS.2024.124488
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Contract Bridge, a four-player imperfect-information game, comprises two phases: bidding and playing. While computer programs excel at playing, bidding remains challenging because it requires exchanging information with a partner while interfering with the opponents' communication. In this work, we introduce a Bridge bidding agent that combines supervised learning, deep reinforcement learning via self-play, and a test-time search approach. Our experiments demonstrate that our agent outperforms WBridge5, a highly regarded computer Bridge program that has won multiple world championships, by 0.98 IMPs (international match points) per deal over 10 000 deals, with a much more cost-effective approach. This result significantly surpasses the previous state of the art (0.85 IMPs per deal). Note that 0.1 IMPs per deal is a significant improvement in Bridge bidding.
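The "IMPs per deal" figure quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard IMP conversion scale used in duplicate Bridge and assumes that each deal yields one signed score difference between the two agents being compared; the abstract does not describe the evaluation pipeline, so the function names and the sample score differences below are hypothetical.

```python
from bisect import bisect_right

# Lower bound (in points) of each band on the standard IMP scale.
# A difference of 0-10 points is worth 0 IMPs; 4000 or more is worth 24.
IMP_THRESHOLDS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
                  750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
                  3500, 4000]


def to_imps(score_diff: int) -> int:
    """Convert a signed point difference on one deal to signed IMPs."""
    magnitude = bisect_right(IMP_THRESHOLDS, abs(score_diff))
    return magnitude if score_diff >= 0 else -magnitude


def mean_imps_per_deal(score_diffs) -> float:
    """Average IMPs per deal over a non-empty sequence of score differences."""
    diffs = list(score_diffs)
    return sum(to_imps(d) for d in diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical per-deal score differences (agent minus opponent).
    sample = [450, -50, 0]
    print([to_imps(d) for d in sample])
    print(mean_imps_per_deal(sample))
```

Because the scale is compressed at large differences, an average of roughly 1 IMP per deal over thousands of deals reflects a consistent edge rather than a few large swings.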
Pages: 2111-2122
Number of pages: 12