DISCRETIZED PURSUIT LEARNING AUTOMATA

Cited by: 99
Authors
OOMMEN, BJ
LANCTOT, JK
Affiliation
[1] School of Computer Science, Carleton University, Ottawa, ON
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS | 1990 / Vol. 20 / No. 4
Keywords
Automata Theory - Computer Simulation - Probability;
DOI
10.1109/21.105092
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The problem of a stochastic learning automaton interacting with an unknown random environment is considered. The fundamental problem is that of learning, through interaction, the best action allowed by the environment (i.e., the action that is rewarded optimally). By using running estimates of reward probabilities to learn the optimal action, an extremely efficient pursuit algorithm (PA), presently among the fastest algorithms known, was reported in earlier works. The improvements gained by rendering the PA discrete are investigated. This is done by restricting the probability of selecting an action to a finite, and hence discrete, subset of [0,1]. This improved scheme is proven to be ε-optimal in all stationary environments. Furthermore, the experimental results seem to indicate that the algorithm presented in the paper is faster than the fastest "nonestimator" learning automata reported to date and also faster than the continuous pursuit automaton. [...] pursuit algorithm is also presented. © 1990 IEEE
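The abstract names two ingredients: running estimates of the reward probabilities, and action probabilities restricted to a finite subset of [0,1]. The following is a minimal Python sketch of one way these can be combined. The abstract does not give the update rule, so the sketch assumes a reward-inaction variant: probabilities change only on reward, in steps of 1/(rN) for r actions and a resolution parameter N. The function name, the parameters (`resolution`, `init_pulls`, `seed`) and the simulated Bernoulli environment are illustrative assumptions, not the authors' code.

```python
import random


def discretized_pursuit(reward_probs, resolution=100, steps=5000,
                        init_pulls=10, seed=0):
    """Sketch of a discretized pursuit automaton (assumed reward-inaction form).

    reward_probs: true reward probability of each action; used only to
                  simulate the environment and never read by the learner.
    Returns the final action-probability vector.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    # Probabilities are stored as integer multiples of 1 / (r * resolution),
    # so they always stay inside the finite (discrete) subset of [0, 1].
    total_units = r * resolution
    units = [resolution] * r          # start from the uniform distribution
    rewards = [0] * r                 # number of rewards received per action
    pulls = [0] * r                   # number of times each action was tried

    def pull(i):
        rewarded = rng.random() < reward_probs[i]
        pulls[i] += 1
        rewards[i] += rewarded
        return rewarded

    # Assumption: initialise the running estimates by sampling every action.
    for i in range(r):
        for _ in range(init_pulls):
            pull(i)

    for _ in range(steps):
        action = rng.choices(range(r), weights=units)[0]
        if pull(action):
            # Pursue the action with the highest current reward estimate.
            best = max(range(r), key=lambda i: rewards[i] / pulls[i])
            for j in range(r):
                if j != best and units[j] > 0:
                    units[j] -= 1     # one discrete step down, floored at 0
            units[best] = total_units - sum(
                units[j] for j in range(r) if j != best)
        # On a penalty the probabilities are left unchanged (inaction).

    return [u / total_units for u in units]
```

A call such as `discretized_pursuit([0.2, 0.8], resolution=10)` returns a probability vector whose entries are all multiples of 1/20. Storing the probabilities as integer counts is a design choice of this sketch: it keeps every value exactly on the discrete grid and avoids floating-point drift.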
Pages: 931-938
Number of pages: 8
Related papers
33 items in total
[1] AN APPLICATION OF THE STOCHASTIC AUTOMATON TO THE INVESTMENT GAME [J]. BABA, N; SOEDA, T; SHOMAN, T; SAWARAGI, Y. INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 1980, 11 (12): 1447-1457
[2] Flerov Y. A., 1972, Journal of Cybernetics, V2, P112, DOI 10.1080/01969727208542916
[3] Isaacson DL, 1976, MARKOV CHAINS THEORY
[4] KARLIN S, 1974, 1ST COURSE STOCHASTI
[5] LAKSHMIVARAHAN S, 1973, IEEE T SYST MAN CYB, VSMC3, P281
[6] LAKSHMIVARAHAN S, 1981, APPL MATH COMPUT, V8, P51, DOI 10.1016/0096-3003(81)90035-7
[7] Lakshmivarahan S, 1981, LEARNING ALGORITHMS
[8] LAKSHMIVARAHAN S, 1979, EECS7901 U OKL SCH E
[9] LANCTOT JK, 1989, THESIS CARLETON U OT
[10] MEYBODI MR, THESIS U OKLAHOMA NO