Last-Position Elimination-Based Learning Automata

被引：53

作者：

Zhang, Junqi ^{[1
]}

Wang, Cheng ^{[1
]}

Zhou, MengChu ^{[1
,2
]}

机构：

[1] Tongji Univ, Minist Educ, Dept Comp Sci & Technol, Key Lab Embedded Syst & Serv Comp, Shanghai 200092, Peoples R China

[2] New Jersey Inst Technol, Dept Elect & Comp Engn, Newark, NJ 07102 USA

来源：

IEEE TRANSACTIONS ON CYBERNETICS | 2014年 / 44卷 / 12期

基金：

美国国家科学基金会;

关键词：

Last-position elimination; learning automata (LA); stationary environments; update scheme; SCHEMES; ALGORITHM; DESIGN;

D O I：

10.1109/TCYB.2014.2309478

中图分类号：

TP [自动化技术、计算机技术];

学科分类号：

0812 ;

摘要：

An update scheme of the state probability vector of actions is critical for learning automata (LA). The most popular is the pursuit scheme that pursues the estimated optimal action and penalizes others. This paper proposes a reverse philosophy that leads to last-position elimination-based learning automata (LELA). The action graded last in terms of the estimated performance is penalized by decreasing its state probability and is eliminated when its state probability becomes zero. All active actions, that is, actions with nonzero state probability, equally share the penalized state probability from the last-position action at each iteration. The proposed LELA is characterized by the relaxed convergence condition for the optimal action, the accelerated step size of the state probability update scheme for the estimated optimal action, and the enriched sampling for the estimated nonoptimal actions. The proof of the epsilon-optimal property for the proposed algorithm is presented. Last-position elimination is a widespread philosophy in the real world and has proved to be also helpful for the update scheme of the learning automaton via the simulations of well-known benchmark environments. In the simulations, two versions of the LELA, using different selection strategies of the last action, are compared with the classical pursuit algorithms Discretized Pursuit Reward-Inaction (DPRI) and Discretized Generalized Pursuit Algorithm (DGPA). Simulation results show that the proposed schemes achieve significantly faster convergence and higher accuracy than the classical ones. Specifically, the proposed schemes reduce the interval to find the best parameter for a specific environment in the classical pursuit algorithms. Thus, they can have their parameter tuning easier to perform and can save much more time when applied to a practical case. Furthermore, the convergence curves and the corresponding variance coefficient curves of the contenders are illustrated to characterize their essential differences and verify the analysis results of the proposed algorithms.

引用

页码：2484 / 2492

页数：9

共 57 条

[1] Generalized pursuit learning schemes: New families of continuous and discretized learning automata [J].

Agache, M ;

Oommen, BJ .

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2002, 32 (06) :738-749

[2]

[Anonymous], 1997, Learning automata and stochastic optimization

[3]

[Anonymous], PLAT JUB C SYST SIGN

[4] Cellular Learning Automata With Multiple Learning Automata in Each Cell and Its Applications [J].

Beigy, Hamid ;

Meybodi, Mohammad Reza .

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2010, 40 (01) :54-65

[5] Learning automata based dynamic guard channel algorithms [J].

Beigy, Harnid ;

Meybodi, M. R. .

COMPUTERS & ELECTRICAL ENGINEERING, 2011, 37 (04) :601-613

[6] Recursive learning automata approach to Markov decision processes [J].

Chang, Hyeong Soo ;

Fu, Michael C. ;

Hu, Jiaqiao ;

Marcus, Steven I. .

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2007, 52 (07) :1349-1355

[7] Circle detection on images using learning automata [J].

Cuevas, E. ;

Wario, F. ;

Zaldivar, D. ;

Perez-Cisneros, M. .

IET COMPUTER VISION, 2012, 6 (02) :121-132

[8] Seeking multi-thresholds for image segmentation with Learning Automata [J].

Cuevas, Erik ;

Zaldivar, Daniel ;

Perez-Cisneros, Marco .

MACHINE VISION AND APPLICATIONS, 2011, 22 (05) :805-818

[9]

Dong W., 2014, IEEE T NEURAL NETW L

[10] A multi-level approach for network design of integrated supply chains [J].

Dotoli, M ;

Fanti, MP ;

Meloni, C ;

Zhou, MC .

INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 2005, 43 (20) :4267-4287

← 1 2 3 4 5 6 →