Reinforcement learning and design of nonparametric sequential decision networks

Cited by: 1

Authors:

Ertin, E [1]

Priddy, KL [1]

Affiliation:

[1] Battelle Mem Inst, Cognit Syst Grp, Columbus, OH 43201 USA

Source:

APPLICATIONS AND SCIENCE OF COMPUTATIONAL INTELLIGENCE V | 2002 / Vol. 4739

Keywords:

dynamic programming; sequential detection; neural networks; reinforcement learning

DOI:

10.1117/12.458718

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper we discuss the design of sequential detection networks for nonparametric sequential analysis. We present a general probabilistic model for sequential detection problems where the sample size as well as the statistics of the sample can be varied. A general sequential detection network handles three decisions. First, the network decides whether to continue sampling or stop and make a final decision. Second, in the case of continued sampling, the network chooses the source for the next sample. Third, once the sampling is concluded, the network makes the final classification decision. We present a Q-learning method to train sequential detection networks through reinforcement learning and cross-entropy minimization on labeled data. As a special case we obtain networks that approximate the optimal parametric sequential probability ratio test. The performance of the proposed detection networks is compared to optimal tests using simulations.
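The abstract names the sequential probability ratio test (SPRT) as the parametric baseline that the trained networks approximate. For orientation, here is a minimal sketch of Wald's classical SPRT, not the paper's network or training procedure. It assumes two simple hypotheses about the mean of Gaussian observations with known variance and uses Wald's approximate thresholds. The function name `sprt_gaussian` and all parameter names are illustrative and do not come from the paper.

```python
import math

def sprt_gaussian(samples, mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1 (known sigma).

    alpha and beta are the target false-alarm and miss probabilities.
    Returns (decision, number_of_samples_used).
    """
    # Wald's approximate stopping thresholds on the log-likelihood ratio.
    upper = math.log((1.0 - beta) / alpha)   # stop and decide H1 at or above
    lower = math.log(beta / (1.0 - alpha))   # stop and decide H0 at or below

    llr = 0.0  # cumulative log-likelihood ratio log p1(x_1..n) / p0(x_1..n)
    n = 0
    for x in samples:
        n += 1
        # Per-sample log-likelihood ratio for two Gaussians with equal variance.
        llr += (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma ** 2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    # Samples ran out before either threshold was crossed.
    return "undecided", n
```

The loop mirrors the first and third decisions described in the abstract: after each sample the test either continues sampling or stops and classifies. The second decision, choosing which source to sample next, has no counterpart in this single-source sketch.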

Pages: 40-47

Page count: 8

