Robust and efficient estimation of the mode of continuous data: The mode as a viable measure of central tendency

Cited by: 34
Author
Bickel, DR [1]
Affiliation
[1] Med Coll Georgia, Off Biostat & Bioinformat, Augusta, GA 30912 USA
Keywords
robust estimation; robust mode; mode estimator; central tendency; measure of location; asymmetry; transformation; efficiency
DOI
10.1080/0094965031000097809
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Although a natural measure of the central tendency of a sample of continuous data is its mode, the mean and median are the most popular measures of location due to their simplicity and ease of estimation. The median is often used instead of the mean for asymmetric data because it is closer to the mode and is insensitive to extreme values in the sample. However, the mode itself can be reliably estimated by first transforming the data into approximately normal data by raising the values to a real power, and then estimating the mean and standard deviation of the transformed data. With this method, two estimators of the mode of the original data are proposed: a simple estimator based on estimating the mean by the sample mean and the standard deviation by the sample standard deviation, and a more robust estimator based on estimating the mean by the median and the standard deviation by the standardized median absolute deviation. Both of these mode estimators were tested using simulated data drawn from normal (symmetric), lognormal (asymmetric), and Pareto (very asymmetric) distributions. The latter two distributions were chosen to test the generality of the method since they are not power transforms of the normal distribution. Each of the proposed estimators of the mode has a much lower variance than the mean and median for the two asymmetric distributions. When outliers were added to the simulations, the more robust of the two proposed mode estimators had a lower bias and variance than the median for the asymmetric distributions, especially when the level of contamination approached the 50% breakdown point. It is concluded that the mode is often a more reliable measure of location than the mean or median for asymmetric data. The proposed estimators also performed well relative to previous estimators of the mode. While different estimators are better under different conditions, the proposed robust estimator is reliable for a wide variety of distributions and contamination levels.
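The two-step idea in the abstract (power-transform, then estimate location and scale) can be illustrated with a minimal Python sketch. Everything specific here is an assumption and not taken from the paper: the exponent `p` is supplied by the caller (the abstract does not say how it is chosen), the data are assumed positive, and the back-transformation uses a closed form derived by assuming that X**p is exactly Normal(mu, sigma**2), in which case the mode of X solves p*y**2 - p*mu*y - (p - 1)*sigma**2 = 0 with y = x**p. The function names `mode_from_power_normal` and `estimate_mode` are hypothetical.

```python
import math
import statistics


def mode_from_power_normal(mu, sigma, p):
    """Mode of X, assuming X**p is exactly Normal(mu, sigma**2) with p > 0.

    Setting the derivative of the log-density of X to zero gives the
    quadratic p*y**2 - p*mu*y - (p - 1)*sigma**2 = 0 in y = x**p;
    the larger root is the local maximum.
    """
    if p <= 0:
        raise ValueError("this sketch only handles p > 0")
    disc = mu * mu + 4.0 * (p - 1.0) * sigma * sigma / p
    if disc < 0:
        raise ValueError("no interior mode under the assumed model")
    y = (mu + math.sqrt(disc)) / 2.0
    if y <= 0:
        raise ValueError("no positive mode under the assumed model")
    return y ** (1.0 / p)


def estimate_mode(data, p, robust=False):
    """Estimate the mode of positive data via a power transform (sketch).

    robust=False: sample mean and sample standard deviation.
    robust=True:  median and standardized median absolute deviation.
    """
    y = [x ** p for x in data]  # assumes every x > 0
    if robust:
        loc = statistics.median(y)
        mad = statistics.median(abs(v - loc) for v in y)
        scale = 1.4826 * mad  # makes the MAD consistent for a normal sd
    else:
        loc = statistics.mean(y)
        scale = statistics.stdev(y)
    return mode_from_power_normal(loc, scale, p)


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]  # one gross outlier
    print(estimate_mode(sample, p=1.0))               # pulled toward the outlier
    print(estimate_mode(sample, p=1.0, robust=True))  # stays near the bulk
```

With `p = 1` the formula reduces to the location estimate itself, so the example mainly shows how the median/MAD variant resists a gross outlier while the mean/standard-deviation variant does not.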
Pages: 899-912
Page count: 14