Combining regular and irregular histograms by penalized likelihood

Cited by: 12
Authors
Rozenholc, Yves [2]
Mildenberger, Thoralf [1]
Gather, Ursula [1]
Affiliations
[1] Tech Univ Dortmund, Fak Stat, D-44221 Dortmund, Germany
[2] Univ Paris 05, UFR Math & Informat, MAP5, UMR CNRS 8145, F-75270 Paris, France
Keywords
Irregular histogram; Density estimation; Penalized likelihood; Dynamic programming; Multivariate histograms; Selection
DOI
10.1016/j.csda.2010.04.021
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
A new, fully automatic procedure for constructing histograms is proposed. It builds both a regular and an irregular histogram and then chooses between the two. To choose the number of bins in the irregular histogram, two different penalties motivated by recent work in model selection are proposed. The algorithm is described and a proper tuning of the penalties is given. Finally, different versions of the procedure are compared with other existing proposals for a wide range of densities and sample sizes. In the simulations, the squared Hellinger risk of the new procedure is always at most twice as large as the risk of the best of the other methods. The procedure is implemented in the R package histogram, available from CRAN. © 2010 Elsevier B.V. All rights reserved.
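To make the idea of choosing a bin count by penalized likelihood concrete, here is a minimal Python sketch. It covers only the regular-histogram step, not the irregular histogram or the final choice between the two. The abstract does not state the penalties, so the penalty form D - 1 + (log D)^2.5, the search range, and all function names below are assumptions made for illustration, not the authors' implementation or the API of the R package.

```python
import math
import random


def regular_log_likelihood(data, n_bins):
    """Maximized log-likelihood of a regular histogram with n_bins bins on [min, max]."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    n = len(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        # Clamp the index so the sample maximum falls in the last bin.
        k = min(int((x - lo) / width), n_bins - 1)
        counts[k] += 1
    # The estimated density in bin k is counts[k] / (n * width);
    # empty bins contribute nothing to the log-likelihood.
    return sum(c * math.log(c / (n * width)) for c in counts if c > 0)


def penalty(n_bins):
    """Assumed penalty (not given in the abstract): D - 1 + (log D)^2.5."""
    return n_bins - 1 + math.log(n_bins) ** 2.5


def choose_bins(data, max_bins=None):
    """Return the bin count that maximizes log-likelihood minus penalty."""
    n = len(data)
    if max_bins is None:
        # Assumed search range: up to n / log(n) bins.
        max_bins = max(1, int(n / math.log(n)))
    scores = {
        d: regular_log_likelihood(data, d) - penalty(d)
        for d in range(1, max_bins + 1)
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(0.0, 1.0) for _ in range(500)]
    print("selected number of bins:", choose_bins(sample))
```

The penalty grows with the number of bins, so extra bins are accepted only when they raise the log-likelihood by more than they cost. An irregular histogram would need a search over bin boundaries as well, which is where the dynamic programming named in the keywords would come in.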
Pages: 3313-3323
Page count: 11