Ultrahigh-Dimensional Multiclass Linear Discriminant Analysis by Pairwise Sure Independence Screening

Cited by: 59
Authors
Pan, Rui [1]
Wang, Hansheng [2]
Li, Runze [3, 4]
Affiliations
[1] Cent Univ Finance & Econ, Sch Math & Stat, Beijing 100081, Peoples R China
[2] Peking Univ, Guanghua Sch Management, Beijing 100871, Peoples R China
[3] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
[4] Penn State Univ, Methodol Ctr, University Pk, PA 16802 USA
Funding
National Natural Science Foundation of China
Keywords
Feature screening; Multiclass classification; Strong screening consistency; Classification
DOI
10.1080/01621459.2014.998760
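Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract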
This article addresses feature screening for multiclass linear discriminant analysis in the ultrahigh-dimensional setting. We allow the number of classes to be relatively large; as a result, the total number of relevant features is larger than usual. This makes the classification problem much more challenging than the conventional one, where the number of classes is small (very often two). To solve the problem, we propose a novel pairwise sure independence screening method for linear discriminant analysis with an ultrahigh-dimensional predictor. The proposed procedure is directly applicable to situations with many classes. We further prove that the proposed method is screening consistent. Simulation studies are conducted to assess the finite-sample performance of the new procedure. We also demonstrate the methodology through an empirical analysis of a real-life example on handwritten Chinese character recognition.
Pages: 169-179
Number of pages: 11