Automatic bandwidth selection in robust nonparametric regression

Cited by: 1
Authors
Assaid, CA [1]
Birch, JB
Affiliations
[1] Dupont Merck Pharmaceut Co, Wilmington, DE 19880 USA
[2] Virginia Tech, Dept Stat, Blacksburg, VA 24061 USA
Keywords
nonparametric regression; robust regression; Loess; bandwidth selection; smoothing
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Nonparametric regression techniques have been studied extensively in recent years because of their flexibility. In addition, robust versions of these techniques have become popular and have been incorporated into some of the standard statistical analysis packages. With new techniques available comes the responsibility of using them properly and in appropriate situations. Often, as in the case presented here, model-fitting diagnostics, such as cross-validation statistics, are not available as tools to determine whether the smoothing parameter value being used is preferable to some other arbitrarily chosen value. We present not only a robust nonparametric regression technique that is a strong competitor to the current standard (Loess; Cleveland, 1979), but also an adjusted cross-validation statistic that can be used to select the bandwidth when it can be assumed that the data contain outliers. We present the form of the estimators to be compared, the theoretical bias and variance calculations based on the underlying model that we assume, the cross-validation technique and the rationale for its components, and a simulation study (single-regressor case) that compares the estimators across varying sample sizes and departures of the true curve from the assumed model. The robust local linear regression (RLLR) fitting procedure using the adjusted cross-validation statistic demonstrates superiority over currently available fitting techniques (Loess using the default bandwidth value provided by a popular statistical software package, and M-Regression) when there is a moderate to significant amount of curvature in the true underlying model. In addition, simulations indicate that, for relatively small sample sizes (n ≤ 50) and across varying amounts of curvature, the proposed technique may be superior to Loess and M-Regression, based on mean squared error values averaged across simulated fits.
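Illustrative note (not part of the original record): the abstract does not give the form of the RLLR estimator or of its adjusted cross-validation statistic, so the sketch below shows only the general idea of choosing a bandwidth by a robustified leave-one-out cross-validation score. Everything in it is an assumption made for illustration: the Gaussian kernel, the Huber loss with tuning constant c = 1.345, the MAD scale estimate, and the function names local_linear_fit, loo_residuals, huber_rho, robust_cv_score, and select_bandwidth. It should not be read as the authors' method.

```python
import numpy as np

def local_linear_fit(x, y, x0, h):
    """Local linear estimate at x0 (Gaussian kernel, bandwidth h; assumed choices)."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)           # kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])   # local design matrix
    WX = X * w[:, None]
    # Solve the weighted normal equations; the intercept is the fit at x0.
    beta, *_ = np.linalg.lstsq(X.T @ WX, WX.T @ y, rcond=None)
    return beta[0]

def loo_residuals(x, y, h):
    """Leave-one-out prediction residuals for bandwidth h."""
    n = len(x)
    res = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        res[i] = y[i] - local_linear_fit(x[keep], y[keep], x[i], h)
    return res

def huber_rho(r, c):
    """Huber loss: quadratic for |r| <= c, linear beyond."""
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r ** 2, c * a - 0.5 * c ** 2)

def robust_cv_score(x, y, h, c=1.345):
    """Mean Huber loss of MAD-standardised leave-one-out residuals,
    rescaled to the units of y squared (an illustrative robust score)."""
    res = loo_residuals(x, y, h)
    scale = np.median(np.abs(res - np.median(res))) / 0.6745
    scale = max(scale, 1e-12)                        # guard against zero scale
    return float(np.mean(huber_rho(res / scale, c)) * scale ** 2)

def select_bandwidth(x, y, grid, c=1.345):
    """Return the grid bandwidth with the smallest robust CV score."""
    scores = [robust_cv_score(x, y, h, c) for h in grid]
    return grid[int(np.argmin(scores))], scores
```

In this sketch, large leave-one-out residuals (such as those at outliers) contribute linearly rather than quadratically to the score, so a few contaminated points have less influence on the chosen bandwidth than they would under ordinary squared-error cross-validation.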
Pages: 259-272
Number of pages: 14
Related papers
21 in total
[1] ASSAID CA, 1997, THESIS VIRGINIA POLY
[2] Cleveland, WS. Robust locally weighted regression and smoothing scatterplots [J]. Journal of the American Statistical Association, 1979, 74(368): 829-836.
[3] Coakley, CW; Hettmansperger, TP. A bounded influence, high breakdown, efficient regression estimator [J]. Journal of the American Statistical Association, 1993, 88(423): 872-880.
[4] FAN JQ, 1994, SCAND J STAT, V21, P433
[5] Fan, JQ. Design-adaptive nonparametric regression [J]. Journal of the American Statistical Association, 1992, 87(420): 998-1004.
[6] Handschin, E; Schweppe, FC; Kohlas, J; Fiechter, A. Bad data-analysis for power-system state estimation [J]. IEEE Transactions on Power Apparatus and Systems, 1975, PA94(02): 329-337.
[7] Hardle, W; Hall, P; Marron, JS. How far are automatically chosen regression smoothing parameters from their optimum [J]. Journal of the American Statistical Association, 1988, 83(401): 86-95.
[8] Hardle, W; Marron, JS. Optimal bandwidth selection in nonparametric regression function estimation [J]. Annals of Statistics, 1985, 13(04): 1465-1481.
[9] Hardle W., 1990, APPL NONPARAMETRIC R, DOI 10.1017/CCOL0521382483
[10] Huber P. J., 1981, ROBUST STAT