Playing Atari with Hybrid Quantum-Classical Reinforcement Learning

Citations: 0
Authors
Lockwood, Owen [1]
Si, Mei [2]
Affiliations
[1] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
[2] Rensselaer Polytech Inst, Dept Cognit Sci, Troy, NY USA
Source
NEURIPS 2020 WORKSHOP ON PRE-REGISTRATION IN MACHINE LEARNING, VOL 148 | 2020 / Vol. 148
Keywords
Pre-registration; Reinforcement Learning; Quantum Machine Learning; CIRCUITS; LEVEL
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite the successes of recent works in quantum reinforcement learning, there are still severe limitations on its applications due to the challenge of encoding large observation spaces into quantum systems. To address this challenge, we propose using a neural network as a data encoder, with the Atari games as our testbed. Specifically, the neural network converts the pixel input from the games to quantum data for a Quantum Variational Circuit (QVC); this hybrid model is then used as a function approximator in the Double Deep Q Networks algorithm. We explore a number of variations of this algorithm and find that our proposed hybrid models do not achieve meaningful results on two Atari games: Breakout and Pong. We suspect this is due to the significantly reduced sizes of the hybrid quantum-classical systems.
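As an illustration of where the function approximator enters, here is a minimal sketch of the standard Double DQN bootstrap target, not the authors' code. In the setting described by the abstract, the Q-values would come from the hybrid neural-network-plus-QVC model; here they are plain NumPy arrays. The function name `double_dqn_targets`, the argument names, and the discount `gamma=0.99` are assumptions made for this example.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute Double DQN targets for a batch of transitions.

    rewards, dones: arrays of shape (batch,)
    q_online_next, q_target_next: arrays of shape (batch, n_actions) holding
    Q-values for the next state from the online and target approximators.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    # The online approximator selects the greedy next action...
    best_actions = np.argmax(q_online_next, axis=1)
    # ...and the target approximator evaluates that action.
    next_values = q_target_next[np.arange(len(best_actions)), best_actions]
    # Terminal transitions (dones == 1) do not bootstrap.
    return rewards + gamma * (1.0 - dones) * next_values
```

Decoupling action selection (online model) from action evaluation (target model) is what distinguishes Double DQN from the original DQN target, and it is independent of whether the approximator is classical or hybrid.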
Pages: 285-301
Page count: 17