Quantum reinforcement learning via policy iteration

Cited by: 0

Authors:

El Amine Cherrat

Iordanis Kerenidis

Anupam Prakash

Affiliations:

[1] Université de Paris

[2] CNRS

[3] IRIF

Source:

Quantum Machine Intelligence | 2023 / Volume 5

Keywords:

Quantum computing; Quantum machine learning; Reinforcement learning; Policy iteration

DOI:

Not available

Abstract:

Quantum computing has shown the potential to substantially speed up machine learning applications, in particular for supervised and unsupervised learning. Reinforcement learning, on the other hand, has become essential for solving many decision-making problems, and policy iteration methods remain the foundation of such approaches. In this paper, we provide a general framework for performing quantum reinforcement learning via policy iteration. We validate our framework in two ways. First, we design and analyze quantum policy evaluation methods for infinite-horizon discounted problems, which build quantum states that approximately encode the value function of a policy. Second, we design and analyze quantum policy improvement methods, which post-process measurement outcomes on these quantum states. Finally, we study the theoretical and experimental performance of our quantum algorithms on two environments from OpenAI’s Gym.
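
For orientation, the classical loop that policy iteration refers to (policy evaluation followed by policy improvement, for an infinite-horizon discounted problem) can be sketched as below. This is a purely classical illustration and not the quantum algorithm of the paper. The two-state, two-action MDP, the tables `P` and `R`, the discount `GAMMA`, and all function names are hypothetical choices made for this sketch.

```python
# Classical policy iteration on a hypothetical 2-state, 2-action MDP.
# All numbers are illustrative assumptions, not taken from the paper.

GAMMA = 0.9  # discount factor (assumed)

# P[s][a][s2]: probability of moving from state s to state s2 under action a
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[1.0, 0.0], [0.0, 1.0]],  # state 1: action 0 moves to state 0, action 1 stays
]
# R[s][a]: expected immediate reward for taking action a in state s
R = [
    [0.0, 0.0],
    [0.0, 1.0],
]


def q_value(s, a, v):
    """One-step lookahead: immediate reward plus discounted expected value."""
    return R[s][a] + GAMMA * sum(P[s][a][s2] * v[s2] for s2 in range(len(P)))


def evaluate_policy(policy, tol=1e-10):
    """Policy evaluation: iterate the Bellman operator of a fixed policy."""
    v = [0.0] * len(P)
    while True:
        new_v = [q_value(s, policy[s], v) for s in range(len(P))]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def improve_policy(v):
    """Policy improvement: act greedily with respect to the value estimate."""
    return [
        max(range(len(P[s])), key=lambda a: q_value(s, a, v))
        for s in range(len(P))
    ]


def policy_iteration(initial_policy):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = list(initial_policy)
    while True:
        v = evaluate_policy(policy)
        new_policy = improve_policy(v)
        if new_policy == policy:
            return policy, v
        policy = new_policy


if __name__ == "__main__":
    policy, v = policy_iteration([0, 0])
    print(policy, [round(x, 3) for x in v])
```

In the framework described in the abstract, the evaluation step is replaced by preparing a quantum state that approximately encodes the value function, and the improvement step by post-processing measurement outcomes on that state. The sketch above only shows the classical structure those two steps fit into.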
