Learning classifier systems with memory condition to solve non-Markov problems

Authors
Zhaoxiang Zang
Dehua Li
Junying Wang
Affiliations
[1] China Three Gorges University, College of Computer and Information Technology
[2] Huazhong University of Science and Technology, Institute for Pattern Recognition and Artificial Intelligence
Source
Soft Computing | 2015 | Vol. 19
Keywords
Learning classifier system; XCS; Memory condition; Aliasing state detection; Partially observable environments; Non-Markov problems
DOI
Not available
Abstract
In the family of learning classifier systems, the classifier system XCS has been used successfully in many applications. However, the standard XCS has no memory mechanism: it can learn an optimal policy only in Markov environments and fails in non-Markov ones. In this work, we develop a new classifier system based on XCS to tackle this problem. It adds a memory list with numbered slots to XCS to record the history of input sensations, and it extends only a small number of classifiers with memory conditions. A classifier's memory condition, which serves as a foothold for disambiguating non-Markov states, senses a specified element in the memory list. This allows our system to "jump over" irrelevant or confusing states and reach decisive prior information that may lie far back in time. In addition, a detection method is employed to recognize non-Markov states in the environment, so that these states do not take control of classifiers' memory conditions. The proposed method was tested on four sets of maze environments of differing complexity. Experimental results show that our system overcomes the overhead problem often encountered in history-window approaches and is an effective technique for solving non-Markov problems.
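The memory-list idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `MemoryList`, `Classifier`, and `matches`, the ternary `#` wildcard encoding, and the convention that slot 0 holds the most recent past observation are all assumptions made for illustration. The sketch covers only how a memory condition might test one numbered slot; the aliasing state detection and the XCS learning components are omitted.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional, Tuple

WILDCARD = "#"  # assumed "don't care" symbol, as in ternary XCS conditions


def matches(pattern: str, observation: str) -> bool:
    """Return True if every non-wildcard symbol of pattern equals the observation."""
    return len(pattern) == len(observation) and all(
        p == WILDCARD or p == o for p, o in zip(pattern, observation)
    )


class MemoryList:
    """Fixed-size history of past observations (hypothetical structure).

    Assumed convention: slot 0 is the most recently recorded observation,
    slot 1 the one before it, and so on.
    """

    def __init__(self, size: int) -> None:
        self._slots = deque(maxlen=size)

    def record(self, observation: str) -> None:
        # appendleft on a full bounded deque drops the oldest entry on the right
        self._slots.appendleft(observation)

    def __len__(self) -> int:
        return len(self._slots)

    def __getitem__(self, slot: int) -> str:
        return self._slots[slot]


@dataclass
class Classifier:
    condition: str                                      # pattern over the current observation
    action: int
    memory_condition: Optional[Tuple[int, str]] = None  # (slot number, pattern), if any

    def is_match(self, observation: str, memory: MemoryList) -> bool:
        if not matches(self.condition, observation):
            return False
        if self.memory_condition is None:
            return True  # ordinary classifier: current observation only
        slot, pattern = self.memory_condition
        if slot >= len(memory):
            return False  # the referenced slot has not been filled yet
        # Only the one specified slot is inspected; slots in between are skipped.
        return matches(pattern, memory[slot])


# Hypothetical aliased situation: "0101" is perceived in two different places,
# and only the observation three steps back tells them apart.
memory = MemoryList(size=4)
for past_observation in ["1100", "0000", "0000"]:
    memory.record(past_observation)

came_from_north = Classifier("0101", action=2, memory_condition=(2, "11##"))
came_from_south = Classifier("0101", action=6, memory_condition=(2, "00##"))
memoryless = Classifier("0101", action=0)

print(came_from_north.is_match("0101", memory))
print(came_from_south.is_match("0101", memory))
print(memoryless.is_match("0101", memory))
```

In this sketch a memory condition stores a single slot number rather than a pattern over the whole history, which is one way to read the abstract's description of "jumping over" intermediate states: the two uninformative `0000` observations in slots 0 and 1 are never examined.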
Pages: 1679-1699
Page count: 20