Batch reinforcement learning with state importance

Cited by: 0
Authors
Li, LH [1 ]
Bulitko, V [1 ]
Greiner, R [1 ]
Affiliations
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB T6G 2E8, Canada
Source
MACHINE LEARNING: ECML 2004, PROCEEDINGS | 2004 / Vol. 3201
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We investigate the problem of using function approximation in reinforcement learning where the agent's policy is represented as a classifier mapping states to actions. High classification accuracy is usually deemed to correlate with high policy quality. But this is not necessarily the case, as increasing classification accuracy can actually decrease the policy's quality. This phenomenon takes place when the learning process begins to focus on classifying less "important" states. In this paper, we introduce a measure of a state's decision-making importance that can be used to improve policy learning. As a result, the focused learning process is shown to converge faster to better policies.
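The abstract's central point — that plain classification accuracy can misrank policies — can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes Q-values for each state–action pair are available from the batch, and takes a state's importance to be the gap between its best and worst action values (a common illustrative choice), so that misclassifying a high-gap state counts for more:

```python
# Hypothetical sketch of importance-weighted policy evaluation.
# The importance definition (best-minus-worst Q-value) and all names
# are illustrative assumptions, not the authors' exact method.

def state_importance(q_values):
    """Gap between the best and worst action values in a state.
    Picking the wrong action in a high-gap state costs more return."""
    return max(q_values) - min(q_values)

def greedy_action(q_values):
    """Index of the highest-valued action (the 'correct' label)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])

def weighted_accuracy(states_q, policy):
    """Classification accuracy where each state is weighted by importance."""
    total = sum(state_importance(q) for q in states_q)
    correct = sum(state_importance(q)
                  for q in states_q if policy(q) == greedy_action(q))
    return correct / total if total else 1.0

# Toy batch: one state with a large action-value gap, one nearly tied.
batch = [[1.0, 5.0],   # important: action 0 loses 4.0 in value
         [2.0, 2.1]]   # unimportant: either action is nearly as good
always_one = lambda q: 1   # policy that always picks action 1

print(weighted_accuracy(batch, always_one))  # 1.0: greedy in both states
```

Under this weighting, a classifier that gets the nearly-tied state wrong loses almost nothing, while one that errs on the high-gap state is penalized heavily — matching the abstract's claim that focusing learning on important states yields better policies.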
Pages: 566-568
Page count: 3
Related papers
13 entries total
  • [1] BAIRD L, 1993, ADV UPDATING
  • [2] DIETTERICH TG, 2002, ADV NEURAL INFORMATI, V14
  • [3] FAN W, 1999, P 16 INT C MACH LEAR
  • [4] FERN A, 2004, ADV NEURAL INFORMATI, V16
  • [5] KEARNS M, 2000, ADV NEURAL INFORMATI, V12
  • [6] LAGOUDAKIS M, 2003, P 12 INT C MACH LEAR
  • [7] LANGFORD J, 2003, P MACH LEARN RED WOR
  • [8] LEVNER I, 2004, P 12 INN APPL ART IN
  • [9] LI L, 2004, THESIS U ALBERTA EDM
  • [10] NG AY, 2000, P 16 C UNC AI