Cross-validation Revisited

Cited by: 20
Author
Dutta, Santanu [1]
Affiliation
[1] Tezpur Univ, Dept Math Sci, Tezpur, Assam, India
Keywords
Density estimation; Least-squares cross-validation; Pseudo-likelihood; 62G07; KERNEL DENSITY-ESTIMATION; BANDWIDTH SELECTION; CONVERGENCE;
DOI
10.1080/03610918.2013.862275
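Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code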
020208 ; 070103 ; 0714 ;
Abstract
Data-based choice of the bandwidth is an important problem in kernel density estimation. The pseudo-likelihood and the least-squares cross-validation bandwidth selectors are well known, but widely criticized in the literature. For heavy-tailed distributions, the L1 distance between the pseudo-likelihood-based estimator and the density does not seem to converge in probability to zero with increasing sample size. Even for normal-tailed densities, the rate of L1 convergence is disappointingly slow. In this article, we report an interesting finding that with minor modifications both the cross-validation methods can be implemented effectively, even for heavy-tailed densities. For both these estimators, the L1 distance from the density is shown to converge completely to zero irrespective of the tail of the density. The expected L1 distance also goes to zero. These results hold even in the presence of a strongly mixing-type dependence. Monte Carlo simulations and analysis of the Old Faithful geyser data suggest that if implemented appropriately, contrary to the traditional belief, the cross-validation estimators compare well with the sophisticated plug-in and bootstrap-based estimators.
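For orientation, the least-squares cross-validation selector named in the abstract can be sketched as follows. This is the textbook criterion with a Gaussian kernel, not the modified procedure proposed in the article, which this record does not describe. The function names (`lscv_score`, `lscv_bandwidth`), the grid search, and the simulated sample are all assumptions made for illustration.

```python
import numpy as np

def lscv_score(h, x):
    """Textbook least-squares cross-validation score for a Gaussian kernel.

    Estimates integral(fhat^2) - (2/n) * sum_i fhat_{-i}(x_i), where
    fhat_{-i} is the kernel estimate computed without observation i.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    d = (x[:, None] - x[None, :]) / h          # scaled pairwise differences
    # The Gaussian kernel convolved with itself is the N(0, 2) density,
    # which gives a closed form for the integral of fhat squared.
    term1 = np.exp(-0.25 * d**2).sum() / (2.0 * np.sqrt(np.pi) * n**2 * h)
    k = np.exp(-0.5 * d**2) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(k, 0.0)                   # leave-one-out: drop i == j
    term2 = 2.0 * k.sum() / (n * (n - 1) * h)
    return term1 - term2

def lscv_bandwidth(x, grid):
    """Return the bandwidth on `grid` that minimizes the LSCV score."""
    scores = np.array([lscv_score(h, x) for h in grid])
    return float(grid[np.argmin(scores)])

# Illustrative data only: a simulated standard normal sample.
rng = np.random.default_rng(0)
sample = rng.standard_normal(200)
grid = np.linspace(0.05, 1.5, 60)
h_cv = lscv_bandwidth(sample, grid)
```

A grid search is used here only because it is simple to read; the bandwidth range is an arbitrary choice for a unit-variance sample and would need rescaling for other data.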
Pages: 472-490
Page count: 19
Related papers
50 in total
  • [31] Fully robust one-sided cross-validation for regression functions
    Olga Y. Savchuk
    Jeffrey D. Hart
    Computational Statistics, 2017, 32 : 1003 - 1025
  • [32] Double one-sided cross-validation of local linear hazards
    Luz Gamiz, Maria
    Mammen, Enno
    Martinez Miranda, Maria Dolores
    Nielsen, Jens Perch
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2016, 78 (04) : 755 - 779
  • [33] Asymptotic comparison of (partial) cross-validation, GCV and randomized GCV in nonparametric regression
    Girard, DA
    ANNALS OF STATISTICS, 1998, 26 (01) : 315 - 334
  • [34] No Cross-Validation Required: An Analytical Framework for Regularized Mixed-Integer Problems
    Soleimani, Behrad
    Khamidehi, Behzad
    Sabbaghian, Maryam
    IEEE COMMUNICATIONS LETTERS, 2020, 24 (12) : 2868 - 2872
  • [35] Nonparametric density estimation by exact leave-p-out cross-validation
    Celisse, Alain
    Robin, Stephane
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2008, 52 (05) : 2350 - 2368
  • [36] AN EFFICIENT ALGORITHM FOR THE LEAST-SQUARES CROSS-VALIDATION WITH SYMMETRICAL AND POLYNOMIAL KERNELS
    LEE, BG
    KIM, BC
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 1990, 19 (04) : 1513 - 1522
  • [37] On practical implementation of the fully robust one-sided cross-validation method in the nonparametric regression and density estimation contexts
    Savchuk, Olga
    COMPUTATIONAL STATISTICS, 2025,
  • [38] Compressive Sensing with Cross-Validation and Stop-Sampling for Sparse Polynomial Chaos Expansions
    Huan, Xun
    Safta, Cosmin
    Sargsyan, Khachik
    Vane, Zachary P.
    Lacaze, Guilhem
    Oefelein, Joseph C.
    Najm, Habib N.
    SIAM-ASA JOURNAL ON UNCERTAINTY QUANTIFICATION, 2018, 6 (02) : 907 - 936
  • [39] Multiplicative local linear hazard estimation and best one-sided cross-validation
    Luz Gamiz, Maria
    Dolores Martinez-Miranda, Maria
    Perch Nielsen, Jens
    JOURNAL OF MACHINE LEARNING RESEARCH, 2019, 20
  • [40] Choice of V for V-Fold Cross-Validation in Least-Squares Density Estimation
    Arlot, Sylvain
    Lerasle, Matthieu
    JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17