Generalized Reinforcement Learning with Concept-Driven Abstract Actions

Cited by: 0
Authors
Chiu, Po-Hsiang [1 ]
Huber, Manfred [1 ]
Affiliations
[1] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
Source
2011 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2011
Keywords
kernel methods; Gaussian process; parametric actions; reinforcement learning; spectral clustering;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The standard reinforcement learning framework often struggles in a varying or evolving environment due to an inherent limitation in its representation: the actions available for decision making are typically assumed to be a fixed set specified before learning begins. Consequently, the derived policy generally cannot adapt to variations in action outcomes, or in the action set itself, without substantial re-learning. In addition, the complexity of state-space modeling is often a bottleneck for standard learning methods. This paper proposes a new reinforcement learning framework that enables the agent to formulate an action-oriented conceptual model while simultaneously deriving the decision policy. The new framework, the Concept-Driven Learning Architecture (CDLA), formulates abstract actions by associating correlated past decision histories. Specifically, kernel functions, Gaussian processes, and spectral clustering are combined into a functional clustering method that identifies a set of coherent, concept-driven abstract actions from which the agent derives a control policy.
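The functional-clustering step the abstract describes (kernel similarity over past decision histories, followed by spectral clustering to group them into abstract actions) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the trajectory feature vectors, the RBF kernel with its `gamma` parameter, and the simple k-means step in the embedded space are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gram matrix of pairwise RBF similarities between trajectory features."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def spectral_cluster(K, k, seed=0, iters=50):
    """Cluster items via the normalized graph Laplacian of affinity matrix K."""
    d = K.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(K)) - D_inv_sqrt @ K @ D_inv_sqrt
    # The k eigenvectors with smallest eigenvalues form the spectral embedding.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    # Plain k-means in the embedded space assigns cluster labels.
    rng = np.random.default_rng(seed)
    centers = U[rng.choice(len(U), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((U[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

# Toy example: two well-separated groups of trajectory feature vectors,
# standing in for decision histories with distinct action concepts.
X = np.vstack([np.zeros((5, 3)), np.full((5, 3), 5.0)])
labels = spectral_cluster(rbf_kernel(X, gamma=0.5), k=2)
```

In the paper's setting the affinity would come from a kernel over decision-history functions (with Gaussian-process modeling of the trajectories) rather than the raw feature vectors used here; the spectral-clustering machinery itself is unchanged.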
Pages: 2575-2582
Page count: 8