AUTOMATIC STATE SPACE AGGREGATION USING A DENSITY BASED TECHNIQUE

Cited by: 0

Authors:

Loscalzo, Steven [1, 2]

Wright, Robert [2]

Affiliations:

[1] SUNY Binghamton, Dept Comp Sci, Binghamton, NY 13902 USA

[2] Air Force Res Lab, Informat Directorate, Rome, NY USA

Source:

ICAART 2011: PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1 | 2011

Keywords:

State space abstraction; Reinforcement learning

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Applying reinforcement learning techniques in continuous environments is challenging because there are infinitely many states to visit in order to learn an optimal policy. To make this situation tractable, abstractions are often used to reduce the infinite state space down to a small and finite one. Some of the more powerful and commonplace abstractions, tiling abstractions such as CMAC, work by aggregating many base states into a single abstract state. Unfortunately, significant manual effort is often necessary in order to apply them to non-trivial control problems. Here we develop an automatic state space aggregation algorithm, Maximum Density Separation, which can produce a meaningful abstraction with minimal manual effort. This method leverages the density of observations in the space to construct a partition and aggregate states in a dense region to the same abstract state. We show that the abstractions produced by this method on two benchmark reinforcement learning problems can outperform fixed tiling methods in terms of both the convergence rate of a learning algorithm and the number of abstract states needed.
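
The contrast the abstract draws between fixed tiling and density-based aggregation can be illustrated with a minimal one-dimensional sketch. This is not the Maximum Density Separation algorithm, whose details this record does not give. It assumes a single state variable and swaps in a simple stand-in heuristic: place cut points in the widest gaps between observed states, so that each dense cluster of observations falls into one abstract state. The function names `uniform_tile_index`, `gap_cuts`, and `abstract_state` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def uniform_tile_index(x, low, high, n_tiles):
    """Fixed tiling: map x in [low, high] to one of n_tiles equal-width tiles."""
    if not low <= x <= high:
        raise ValueError("x must lie in [low, high]")
    width = (high - low) / n_tiles
    # Clamp so that x == high lands in the last tile instead of overflowing.
    return min(int((x - low) / width), n_tiles - 1)


def gap_cuts(observations, n_cuts):
    """Stand-in density heuristic (an assumption, not the paper's method).

    Returns up to n_cuts cut points, each placed at the midpoint of one of
    the widest gaps between consecutive distinct observations, i.e. where
    the observed density is lowest.
    """
    xs = sorted(set(observations))
    gaps = [(xs[i + 1] - xs[i], i) for i in range(len(xs) - 1)]
    widest = sorted(gaps, reverse=True)[:n_cuts]
    return sorted((xs[i] + xs[i + 1]) / 2 for _, i in widest)


def abstract_state(x, cuts):
    """Map a base state x to the index of the interval it falls into."""
    return bisect_right(cuts, x)


# Two dense clusters of observed states separated by an empty region.
observed = [0.10, 0.15, 0.20, 0.80, 0.85, 0.90]
cuts = gap_cuts(observed, n_cuts=1)
labels = [abstract_state(x, cuts) for x in observed]
tiles = [uniform_tile_index(x, 0.0, 1.0, 4) for x in observed]
```

In this toy setting the single density-derived cut separates the two clusters with two abstract states, whereas the fixed tiling spends four tiles regardless of where the observations fall. That mirrors, in a very simplified way, the abstract's point about the number of abstract states needed.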

Citation

Pages: 249-256

Page count: 8

References: 14 in total

[1] [Anonymous], 1997, MACHINE LEARNING, MCGRAW-HILL SCIENCE/ENGINEERING/MATH

[2] [Anonymous], 1996, THESIS

[3] Boyan J. A., 1995, Advances in Neural Information Processing Systems 7, P369

[4] Comaniciu, D; Meer, P. Mean shift: A robust approach toward feature space analysis [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2002, 24 (05): 603-619

[5] Gomez F, 2006, LECT NOTES COMPUT SC, V4212, P654

[6] Gomez FJ, 1999, IJCAI-99: PROCEEDINGS OF THE SIXTEENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS 1 & 2, P1356

[7] JAMES D, 2004, P 2004 C GEN EV COMP

[8] Li L., 2006, ISAIM

[9] MILLER, WT; GLANZ, FH; KRAFT, LG. CMAC - AN ASSOCIATIVE NEURAL NETWORK ALTERNATIVE TO BACKPROPAGATION [J]. PROCEEDINGS OF THE IEEE, 1990, 78 (10): 1561-1567

[10] Stanley K O, 2002, Proceedings of the 4th Annual Conference on Genetic and Evolutionary Computation, P569
