A procedure of linear discrimination analysis with detected sparsity structure for high-dimensional multi-class classification

Cited by: 2
Authors
Luo, Shan [1]
Chen, Zehua [2]
Affiliations
[1] Shanghai Jiao Tong Univ, 800 Dongchuan RD, Shanghai 200240, Peoples R China
[2] Natl Univ Singapore, 3 Sci Dr 2, Singapore 117543, Singapore
Keywords
High-dimensionality; Linear discrimination analysis; Misclassification rate; Multi-class discrimination; Sequential procedure; Sparsity; Variable selection; Asymptotics; Centroids
DOI
10.1016/j.jmva.2020.104641
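Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract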
In this article, we consider discrimination analyses in high-dimensional cases where the dimension of the predictor vector diverges with the sample size in a theoretical setting. The emphasis is on the case where the number of classes is bigger than two. We first deal with the asymptotic misclassification rates of linear discrimination rules under various conditions. In practical high-dimensional classification problems, it is reasonable to assume certain sparsity conditions on the class means and the common precision matrix. Our theoretical study reveals that with known sparsity structures an asymptotically optimal linear discrimination rule can be constructed. Motivated by the theoretical result, we propose a linear discrimination rule constructed based on estimated sparsity structures which is dubbed as linear discrimination with detected sparsity (LDwDS). The asymptotic optimality of LDwDS is established. Numerical studies are carried out for the comparison of LDwDS with other existing methods. The numerical studies include a comprehensive simulation study and two real data analyses. The numerical studies demonstrate that the LDwDS has an edge in terms of misclassification rate over all the other methods under consideration in the comparison. (C) 2020 Elsevier Inc. All rights reserved.
Pages: 20