A U-classifier for high-dimensional data under non-normality

Cited by: 1
Authors
Ahmad, M. Rauf
Pavlenko, Tatjana
Affiliations
[1] Uppsala Univ, Dept Stat, Uppsala, Sweden
[2] KTH, Royal Inst Technol, Dept Math, Stockholm, Sweden
Keywords
Bias-adjusted classifier; High-dimensional classification; U-statistics; Linear discriminant analysis; Gene expression data; Statistics; Multiclass; Rules; Tests
DOI
10.1016/j.jmva.2018.05.008
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
A classifier for two or more samples is proposed when the data are high-dimensional and the distributions may be non-normal. The classifier is constructed as a linear combination of two easily computable and interpretable components, the U-component and the P-component. The U-component is a linear combination of U-statistics of bilinear forms of pairwise distinct vectors from independent samples. The P-component, the discriminant score, is a function of the projection of the U-component on the observation to be classified. Together, the two components constitute an inherently bias-adjusted classifier valid for high-dimensional data. The classifier is linear but its linearity does not rest on the assumption of homoscedasticity. Properties of the classifier and its normal limit are given under mild conditions. Misclassification errors and asymptotic properties of their empirical counterparts are discussed. Simulation results are used to show the accuracy of the proposed classifier for small or moderate sample sizes and large dimensions. Applications involving real data sets are also included. (C) 2018 Elsevier Inc. All rights reserved.
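The abstract does not give the classifier's formula, so the following is only a minimal two-class sketch of the general idea, under stated assumptions. It assumes the "P-component" can be read as the projection of the new observation on the difference of sample means, and the "U-component" as a difference of U-statistics that average the bilinear forms x_k'x_r over pairs of distinct observations within each sample. The function names, the 1/p scaling, and the threshold at zero are illustrative choices, not taken from the paper.

```python
import numpy as np


def u_stat_self_inner(X):
    """U-statistic averaging x_k' x_r over all ordered pairs k != r (rows of X).

    Unlike the plug-in xbar' xbar, this is unbiased for mu' mu, because
    pairs with k == r (which contribute trace(Sigma)) are excluded.
    """
    n = X.shape[0]
    G = X @ X.T                      # Gram matrix of all bilinear forms
    return (G.sum() - np.trace(G)) / (n * (n - 1))


def discriminant_score(x, X1, X2):
    """Assumed score: projection term minus half the U-statistic difference."""
    p = x.shape[0]
    p_part = (X1.mean(axis=0) - X2.mean(axis=0)) @ x   # projection on x
    u_part = u_stat_self_inner(X1) - u_stat_self_inner(X2)
    return (p_part - 0.5 * u_part) / p                 # 1/p scaling is an assumption


def classify(x, X1, X2):
    """Assign x to sample 1 if the score is positive, otherwise to sample 2."""
    return 1 if discriminant_score(x, X1, X2) > 0 else 2
```

In this sketch no covariance matrix is estimated or inverted, which is why it stays computable when the dimension p exceeds the sample sizes; the paper itself should be consulted for the actual definition and its multi-sample extension.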
Pages: 269-283
Number of pages: 15