Reinforcement learning and design of nonparametric sequential decision networks

Cited by: 1
Authors
Ertin, E [1]
Priddy, KL [1]
Affiliations
[1] Battelle Mem Inst, Cognit Syst Grp, Columbus, OH 43201 USA
Source
APPLICATIONS AND SCIENCE OF COMPUTATIONAL INTELLIGENCE V | 2002 / Vol. 4739
Keywords
dynamic programming; sequential detection; neural networks; reinforcement learning
DOI
10.1117/12.458718
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we discuss the design of sequential detection networks for nonparametric sequential analysis. We present a general probabilistic model for sequential detection problems where the sample size as well as the statistics of the sample can be varied. A general sequential detection network handles three decisions. First, the network decides whether to continue sampling or stop and make a final decision. Second, in the case of continued sampling, the network chooses the source for the next sample. Third, once the sampling is concluded, the network makes the final classification decision. We present a Q-learning method to train sequential detection networks through reinforcement learning and cross-entropy minimization on labeled data. As a special case we obtain networks that approximate the optimal parametric sequential probability ratio test. The performance of the proposed detection networks is compared to optimal tests using simulations.
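For orientation, the parametric baseline named in the abstract, Wald's sequential probability ratio test, can be sketched as below. This is a minimal illustration under stated assumptions and not the authors' network or training procedure: the Gaussian observation model, the function name `sprt_gaussian`, and the default parameter values are all hypothetical choices made here. It shows only the first and third decisions (continue or stop, then classify); the choice of sampling source is omitted.

```python
import math

def sprt_gaussian(samples, mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 vs H1: mean = mu1, known sigma (assumed model).

    Returns (decision, n_used). decision is 1 (accept H1), 0 (accept H0),
    or None if the samples run out before either threshold is crossed.
    """
    # Wald's approximate thresholds for target error rates alpha and beta.
    upper = math.log((1 - beta) / alpha)   # stop and decide H1 at or above this
    lower = math.log(beta / (1 - alpha))   # stop and decide H0 at or below this
    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for one Gaussian observation.
        llr += (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma ** 2
        if llr >= upper:
            return 1, n
        if llr <= lower:
            return 0, n
    # No threshold crossed: the test would keep sampling if it could.
    return None, n
```

With the defaults, each observation equal to `mu1` adds 0.5 to the log-likelihood ratio, so the test stops once the running sum passes log(19), which is about 2.94. A learned detection network of the kind the abstract describes would replace the closed-form likelihood ratio with a function fitted from labeled data.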
Pages: 40-47
Page count: 8