Improved Stein-type shrinkage estimators for the high-dimensional multivariate normal covariance matrix

Cited by: 68
Authors
Fisher, Thomas J. [1 ]
Sun, Xiaoqian [2 ]
Affiliations
[1] Univ Missouri, Dept Math & Stat, Kansas City, MO 64110 USA
[2] Clemson Univ, Dept Math Sci, Clemson, SC 29634 USA
Keywords
Covariance matrix; Shrinkage estimation; High-dimensional data analysis; Gene-expression data; Classification; Tumor
DOI
10.1016/j.csda.2010.12.006
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Many applications require an estimate of the covariance matrix that is non-singular and well-conditioned. As the dimensionality increases, the sample covariance matrix becomes ill-conditioned or even singular. A common approach to estimating the covariance matrix when the dimensionality is large is Stein-type shrinkage estimation, in which a convex combination of the sample covariance matrix and a well-conditioned target matrix is used as the estimate. Recent work in the literature has shown that an optimal combination exists under mean-squared loss; however, it must be estimated from the data. In this paper, we introduce a new set of estimators of the optimal convex combination for three commonly used target matrices. A simulation study shows an improvement over the estimators in the literature when the data are of extremely high dimension. A data analysis shows that the estimators are effective in discriminant and classification analysis. (C) 2010 Elsevier B.V. All rights reserved.
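As an illustration of the convex combination described in the abstract, the following is a minimal sketch and not the authors' estimators. The shrinkage intensity `lam` is supplied by hand rather than estimated from the data, and the function name `shrink_covariance` and the scaled-identity target are assumptions chosen for the example; the record does not say which three targets the paper uses.

```python
import numpy as np


def shrink_covariance(X, lam):
    """Return (1 - lam) * S + lam * T for data X of shape (n, p).

    S is the sample covariance matrix and T = (tr(S) / p) * I is a
    scaled-identity target (an assumption made for this sketch).
    lam is a user-chosen shrinkage intensity in [0, 1]; it is NOT
    the data-driven optimal intensity discussed in the abstract.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    S = np.cov(X, rowvar=False)               # p x p sample covariance
    target = (np.trace(S) / p) * np.eye(p)    # well-conditioned target
    return (1.0 - lam) * S + lam * target


# Hypothetical high-dimensional setting: fewer observations than variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 50))             # n = 10, p = 50
S = np.cov(X, rowvar=False)                   # singular because n <= p
sigma_hat = shrink_covariance(X, lam=0.5)     # 0.5 is an arbitrary choice

print("rank of S:", np.linalg.matrix_rank(S))
print("smallest eigenvalue of shrunk estimate:",
      np.linalg.eigvalsh(sigma_hat).min())
```

With fewer observations than variables the sample covariance matrix is rank-deficient, while any `lam` greater than zero lifts every eigenvalue of the combined estimate above zero. This target also leaves the trace unchanged, so the total variance is preserved.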
Pages: 1909-1918
Page count: 10
Related papers
25 in total
[1]   Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays [J].
Alon, U ;
Barkai, N ;
Notterman, DA ;
Gish, K ;
Ybarra, S ;
Mack, D ;
Levine, AJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) :6745-6750
[2]   LIMIT OF THE SMALLEST EIGENVALUE OF A LARGE DIMENSIONAL SAMPLE COVARIANCE-MATRIX [J].
BAI, ZD ;
YIN, YQ .
ANNALS OF PROBABILITY, 1993, 21 (03) :1275-1294
[3]   Regularized estimation of large covariance matrices [J].
Bickel, Peter J. ;
Levina, Elizaveta .
ANNALS OF STATISTICS, 2008, 36 (01) :199-227
[4]   SHRINKAGE ESTIMATION OF HIGH DIMENSIONAL COVARIANCE MATRICES [J].
Chen, Yilun ;
Wiesel, Ami ;
Hero, Alfred O., III .
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, :2937-2940
[5]  
DEMPSTER A.P., 1969, ELEMENTS CONTINUOUS
[6]   Boosting for tumor classification with gene expression data [J].
Dettling, M ;
Bühlmann, P .
BIOINFORMATICS, 2003, 19 (09) :1061-1069
[7]   Comparison of discrimination methods for the classification of tumors using gene expression data [J].
Dudoit, S ;
Fridlyand, J ;
Speed, TP .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2002, 97 (457) :77-87
[8]   STEINS PARADOX IN STATISTICS [J].
EFRON, B ;
MORRIS, C .
SCIENTIFIC AMERICAN, 1977, 236 (05) :119-127
[9]   BIASED VERSUS UNBIASED ESTIMATION [J].
EFRON, B .
ADVANCES IN MATHEMATICS, 1975, 16 (03) :259-277
[10]   DATA-ANALYSIS USING STEINS ESTIMATOR AND ITS GENERALIZATIONS [J].
EFRON, B ;
MORRIS, C .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1975, 70 (350) :311-319