Far Casting Cross-Validation

Cited by: 10
Authors
Carmack, Patrick S. [1]
Schucany, William R.
Spence, Jeffrey S. [2]
Gunst, Richard F. [3]
Lin, Qihua [2]
Haley, Robert W. [2]
Affiliations
[1] Univ Cent Arkansas, Dept Math, Conway, AR 72035 USA
[2] Univ Texas SW Med Ctr Dallas, Div Epidemiol, Dept Internal Med, Dallas, TX 75390 USA
[3] So Methodist Univ, Dept Stat Sci, Dallas, TX 75275 USA
Keywords
Dependent data; Optimistic error rates; Prediction; Temporal correlation; Tuning parameter
DOI
10.1198/jcgs.2009.07034
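Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract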
Cross-validation has long been used for choosing tuning parameters and other model selection tasks. It generally performs well provided the data are independent, or nearly so. Improvements have been suggested which address ordinary cross-validation's (OCV) shortcomings in correlated data. Whereas these techniques have merit, they can still lead to poor model selection in correlated data or are not readily generalizable to high-dimensional data. The proposed solution, far casting cross-validation (FCCV), addresses these problems. FCCV withholds correlated neighbors in every aspect of the cross-validation procedure. The result is a technique that stresses a fitted model's ability to extrapolate rather than interpolate. This generally leads to better model selection in correlated datasets. Whereas FCCV is less than optimal in the independence case, our improvement of OCV applies more generally to higher dimensional error processes and to both parametric and nonparametric model selection problems. To facilitate introduction, we consider only one application, namely estimating global bandwidths for curve estimation with local linear regression. We provide theoretical motivation and report some comparative results from a simulation experiment and on a time series of annual global temperature deviations. For such data, FCCV generally has lower average squared error when disturbances are correlated. Supplementary materials are available online.
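The idea of withholding correlated neighbors when choosing a bandwidth for local linear regression can be illustrated with a short sketch. This is not the authors' FCCV procedure, whose details the abstract does not give. It is a simplified gap-based leave-out scheme written under stated assumptions: the design points are sorted and equally spaced, the kernel is Epanechnikov, and the function names `local_linear_fit` and `gap_cv_score`, the `gap` parameter, the AR(1) noise model, and the bandwidth grid are all hypothetical choices made for illustration. Setting `gap=0` reduces the score to ordinary leave-one-out cross-validation.

```python
import numpy as np

def local_linear_fit(x_train, y_train, x0, h):
    """Local linear estimate at x0 using an Epanechnikov kernel with bandwidth h."""
    u = (x_train - x0) / h
    w = np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u**2), 0.0)
    if np.count_nonzero(w) < 2:
        return np.nan  # too few points inside the window to fit a line
    # Weighted least squares on (1, x - x0); the intercept is the fit at x0.
    X = np.column_stack([np.ones_like(x_train), x_train - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y_train * sw, rcond=None)
    return beta[0]

def gap_cv_score(x, y, h, gap):
    """Mean squared prediction error when point i is predicted without itself
    and without its `gap` nearest neighbors on each side (by index).
    Assumes x is sorted, so index distance tracks distance in x."""
    n = len(x)
    idx = np.arange(n)
    errs = np.empty(n)
    for i in range(n):
        keep = np.abs(idx - i) > gap
        pred = local_linear_fit(x[keep], y[keep], x[i], h)
        if not np.isfinite(pred):
            return np.inf  # bandwidth too small to predict across the gap
        errs[i] = (y[i] - pred) ** 2
    return errs.mean()

# Illustrative data: smooth trend plus AR(1) noise (assumed, not from the paper).
rng = np.random.default_rng(0)
n = 100
x = np.linspace(0.0, 1.0, n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + rng.normal(scale=0.2)
y = np.sin(2 * np.pi * x) + e

# Compare the bandwidth chosen with no gap against one chosen with a gap of 5.
grid = np.linspace(0.05, 0.5, 10)
h_ocv = grid[np.argmin([gap_cv_score(x, y, h, 0) for h in grid])]
h_gap = grid[np.argmin([gap_cv_score(x, y, h, 5) for h in grid])]
print("bandwidth with gap=0:", h_ocv)
print("bandwidth with gap=5:", h_gap)
```

Removing neighbors forces each prediction to rely on observations farther away, so the score rewards extrapolation rather than interpolation. With positively correlated errors, ordinary leave-one-out scoring tends to favor bandwidths that are too small, because nearby residuals resemble the withheld one.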
Pages: 879-893
Number of pages: 15