Quantum reinforcement learning via policy iteration

Cited by: 0

Authors:

El Amine Cherrat

Iordanis Kerenidis

Anupam Prakash

Affiliations:

[1] Université de Paris

[2] CNRS

[3] IRIF

Source:

Quantum Machine Intelligence | 2023 / Volume 5

Keywords:

Quantum computing; Quantum machine learning; Reinforcement learning; Policy iteration

DOI:

Not available

Abstract:

Quantum computing has shown the potential to substantially speed up machine learning applications, in particular for supervised and unsupervised learning. Reinforcement learning, on the other hand, has become essential for solving many decision-making problems and policy iteration methods remain the foundation of such approaches. In this paper, we provide a general framework for performing quantum reinforcement learning via policy iteration. We validate our framework by designing and analyzing quantum policy evaluation methods for infinite-horizon discounted problems by building quantum states that approximately encode the value function of a policy, and quantum policy improvement methods by post-processing measurement outcomes on these quantum states. Last, we study the theoretical and experimental performance of our quantum algorithms on two environments from OpenAI’s Gym.
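For orientation, the two steps the abstract names (policy evaluation for an infinite-horizon discounted problem, followed by policy improvement) can be sketched classically as below. This is a minimal illustration of textbook policy iteration, not the quantum algorithm of the paper: the exact linear solve stands in for the paper's approximate quantum encoding of the value function, and the array layout (`P` of shape states x actions x states, `R` of shape states x actions), the function names, and the two-state toy MDP are all assumptions made for this example.

```python
import numpy as np


def policy_evaluation(P, R, policy, gamma):
    """Return the value function of a deterministic policy.

    Solves (I - gamma * P_pi) v = r_pi exactly (classical stand-in).
    """
    n_states = P.shape[0]
    idx = np.arange(n_states)
    P_pi = P[idx, policy]  # (S, S) transition matrix under the policy
    r_pi = R[idx, policy]  # (S,) expected reward under the policy
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)


def policy_improvement(P, R, v, gamma):
    """Return the greedy policy with respect to the value function v."""
    q = R + gamma * (P @ v)  # (S, A) action values
    return np.argmax(q, axis=1)


def policy_iteration(P, R, gamma=0.9, max_iters=100):
    """Alternate evaluation and improvement until the policy is stable."""
    policy = np.zeros(P.shape[0], dtype=int)
    v = policy_evaluation(P, R, policy, gamma)
    for _ in range(max_iters):
        new_policy = policy_improvement(P, R, v, gamma)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
        v = policy_evaluation(P, R, policy, gamma)
    return policy, v


# Hypothetical toy MDP: 2 states, 2 actions.
# Action 0 stays in the current state, action 1 switches state.
# Only staying in state 1 yields a reward (of 1).
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 0] = 1.0
R = np.array([[0.0, 0.0],
              [1.0, 0.0]])

policy, v = policy_iteration(P, R, gamma=0.9)
print(policy, v)
```

In this toy problem the stable policy moves from state 0 to state 1 and then stays there. Per the abstract, the paper replaces the exact evaluation step with quantum states that approximately encode the value function, and performs improvement by post-processing measurement outcomes on those states.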

