Density estimation and random variate generation using multilayer networks

Cited by: 28
Authors
Magdon-Ismail, M [1 ]
Atiya, A [2]
Affiliations
[1] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
[2] Countrywide Capital Markets, Calabasas, CA 91302 USA
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2002 / Vol. 13 / No. 3
Keywords
convergence rate; density estimation; distribution function; estimation error; multilayer network; neural network; random number generation; stochastic algorithms
DOI
10.1109/TNN.2002.1000120
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper we consider two important topics: density estimation and random variate generation. We present a framework that is easily implemented using the familiar multilayer neural network. First, we develop two new methods for density estimation: a stochastic method and a related deterministic method. Both methods are based on approximating the distribution function, the density being obtained by differentiation. In the second part of the paper, we develop new random number generation methods. Our methods do not suffer from some of the restrictions of existing methods, in that they can be used to generate numbers from any density provided that certain smoothness conditions are satisfied. One of our methods is based on an observed inverse relationship between the density estimation process and random number generation. We present two variants of this method: a stochastic and a deterministic version. We propose a second method based on a novel control formulation of the problem, in which a "controller network" is trained to shape a given density into the desired density. We justify all of the proposed methods by providing theoretical convergence results. In particular, we prove that the L-infinity convergence to the true density, for both the density estimation and random variate generation techniques, occurs at a rate O((log log N / N)^((1-ε)/2)), where N is the number of data points and ε can be made arbitrarily small for sufficiently smooth target densities. This bound is very close to the optimally achievable convergence rate under similar smoothness conditions. For comparison, the L2 root-mean-square (rms) convergence rate of a positive kernel density estimator is O(N^(-2/5)) when the optimal kernel width is used. We present numerical simulations to illustrate the performance of the proposed density estimation and random variate generation methods. In addition, we present an extended introduction and bibliography that serve as an overview and reference for the practitioner.
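The abstract's central recipe is compact enough to sketch: train a network G(x) to match the empirical distribution function of the sample, differentiate G to obtain a density estimate, and invert G to generate random variates (the inverse relationship noted above). The NumPy sketch below is illustrative only, not the authors' algorithm; the one-hidden-layer architecture, training loop, and grid-based numerical inversion are assumptions made for the demonstration.

import numpy as np

# Minimal sketch (not the authors' implementation): fit a one-hidden-layer
# network G(x) to the empirical distribution function (CDF) of a sample,
# differentiate G for a density estimate, and invert G numerically to
# generate new random variates.

rng = np.random.default_rng(0)
data = rng.normal(size=500)                         # sample from the "unknown" density
xs = np.sort(data)
ecdf = (np.arange(1, xs.size + 1) - 0.5) / xs.size  # empirical CDF targets

H = 16                                              # G(x) = w2 . tanh(w1*x + b1) + b2
w1 = rng.normal(size=H)
b1 = rng.normal(size=H)
w2 = 0.1 * rng.normal(size=H)
b2 = 0.0

def forward(x):
    h = np.tanh(np.outer(x, w1) + b1)               # (n, H) hidden activations
    return h @ w2 + b2, h

lr = 0.05
for _ in range(3000):                               # plain batch gradient descent on squared error
    y, h = forward(xs)
    g_out = 2.0 * (y - ecdf) / xs.size              # d(loss)/dy
    w2 -= lr * (h.T @ g_out)
    b2 -= lr * g_out.sum()
    g_h = np.outer(g_out, w2) * (1.0 - h ** 2)      # backprop through tanh
    w1 -= lr * (g_h * xs[:, None]).sum(axis=0)
    b1 -= lr * g_h.sum(axis=0)

def density(x):                                     # density estimate = dG/dx, computed analytically
    h = np.tanh(np.outer(x, w1) + b1)
    return ((1.0 - h ** 2) * w1) @ w2

grid = np.linspace(xs.min(), xs.max(), 2000)        # invert G on a grid for inverse-CDF sampling
G, _ = forward(grid)
G = np.maximum.accumulate(G)                        # enforce monotonicity for the numerical inverse
u = rng.uniform(G.min(), G.max(), size=1000)
samples = np.interp(u, G, grid)                     # x = G^{-1}(u)

print("density estimate at 0:", density(np.array([0.0]))[0])
print("generated sample mean/std:", samples.mean(), samples.std())

For the paper's actual stochastic and deterministic training schemes, and the controller-network formulation, see the full text via the DOI above.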
Pages: 497-520
Page count: 24