Towards Multi-agent Reinforcement Learning using Quantum Boltzmann Machines

Cited by: 0
Authors
Mueller, Tobias [1]
Roch, Christoph [1]
Schmid, Kyrill [1]
Altmann, Philipp [1]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Mobile & Distributed Syst Grp, Munich, Germany
Source
ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 1 | 2022
Keywords
Multi-agent; Reinforcement Learning; D-Wave; Boltzmann Machines; Quantum Annealing; Quantum Artificial Intelligence;
DOI
10.5220/0010762100003116
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning has driven impressive advances in machine learning. Simultaneously, quantum-enhanced machine learning algorithms using quantum annealing are undergoing heavy development. Recently, a multi-agent reinforcement learning (MARL) architecture combining both paradigms has been proposed. This novel algorithm, which utilizes Quantum Boltzmann Machines (QBMs) for Q-value approximation, has outperformed regular deep reinforcement learning in terms of time-steps needed to converge. However, this algorithm was restricted to single-agent and small 2x2 multi-agent grid domains. In this work, we propose an extension to the original concept in order to solve more challenging problems. Similar to classic DQNs, we add an experience replay buffer and use different networks for approximating the target and policy values. The experimental results show that learning becomes more stable and that agents are able to find optimal policies in grid domains with higher complexity. Additionally, we assess how parameter sharing influences the agents' behavior in multi-agent domains. Quantum sampling proves to be a promising method for reinforcement learning tasks, but is currently limited by the Quantum Processing Unit (QPU) size and therefore by the size of the input and Boltzmann machine.
Pages: 121-130
Page count: 10
Related papers
33 entries in total
[1]  
ACKLEY DH, 1985, COGNITIVE SCI, V9, P147
[2]   Quantum-assisted Helmholtz machines: A quantum-classical deep learning framework for industrial datasets in near-term devices [J].
Benedetti, Marcello ;
Realpe-Gomez, John ;
Perdomo-Ortiz, Alejandro .
QUANTUM SCIENCE AND TECHNOLOGY, 2018, 3 (03)
[3]   Quantum machine learning [J].
Biamonte, Jacob ;
Wittek, Peter ;
Pancotti, Nicola ;
Rebentrost, Patrick ;
Wiebe, Nathan ;
Lloyd, Seth .
NATURE, 2017, 549 (7671) :195-202
[4]  
Binmore K., 2007, Game Theory: A Very Short Introduction
[5]  
Charpentier Arthur., 2020, Reinforcement learning in economics and finance
[6]  
Crawford Daniel, 2019, Reinforcement learning using quantum boltzmann machines
[7]   Quantum Speedup for Active Learning Agents [J].
Davide Paparo, Giuseppe ;
Dunjko, Vedran ;
Makmal, Adi ;
Angel Martin-Delgado, Miguel ;
Briegel, Hans J. .
PHYSICAL REVIEW X, 2014, 4 (03)
[8]  
Foerster JN, 2016, ADV NEUR IN, V29
[9]  
Foerster JN, 2018, AAAI CONF ARTIF INTE, P2974
[10]  
Jerbi S, 2020, QUANTUM ENHANCEMENTS