A U-classifier for high-dimensional data under non-normality

Cited by: 1
Authors
Ahmad, M. Rauf
Pavlenko, Tatjana
Affiliations
[1] Uppsala Univ, Dept Stat, Uppsala, Sweden
[2] KTH, Royal Inst Technol, Dept Math, Stockholm, Sweden
Keywords
Bias-adjusted classifier; High-dimensional classification; U-statistics; LINEAR DISCRIMINANT-ANALYSIS; GENE-EXPRESSION DATA; STATISTICS; MULTICLASS; RULES; TESTS;
DOI
10.1016/j.jmva.2018.05.008
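Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code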
020208 ; 070103 ; 0714 ;
Abstract
A classifier for two or more samples is proposed when the data are high-dimensional and the distributions may be non-normal. The classifier is constructed as a linear combination of two easily computable and interpretable components, the U-component and the P-component. The U-component is a linear combination of U-statistics of bilinear forms of pairwise distinct vectors from independent samples. The P-component, the discriminant score, is a function of the projection of the U-component on the observation to be classified. Together, the two components constitute an inherently bias-adjusted classifier valid for high-dimensional data. The classifier is linear but its linearity does not rest on the assumption of homoscedasticity. Properties of the classifier and its normal limit are given under mild conditions. Misclassification errors and asymptotic properties of their empirical counterparts are discussed. Simulation results are used to show the accuracy of the proposed classifier for small or moderate sample sizes and large dimensions. Applications involving real data sets are also included. (C) 2018 Elsevier Inc. All rights reserved.
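The two-component idea in the abstract can be illustrated with a minimal sketch for two classes. This is not the paper's exact formula: the function names `u_stat_norm_sq` and `classify`, the scaling by the dimension `p`, and the decision threshold at zero are all assumptions made for illustration. The sketch assumes a rule based on comparing squared Euclidean distances to the class means, where each squared mean norm is estimated by a U-statistic of the bilinear form x_i'x_j over pairs i ≠ j. That estimate is unbiased, whereas the plug-in squared norm of the sample mean carries a bias of tr(Σ)/n.

```python
import numpy as np

def u_stat_norm_sq(X):
    """U-statistic estimate of ||mu||^2 from the rows of X (shape n x p).

    Averages the bilinear form x_i' x_j over all ordered pairs i != j,
    so no squared term x_i' x_i (the source of the bias) enters.
    """
    n = X.shape[0]
    s = X.sum(axis=0)
    # sum_{i != j} x_i' x_j = ||sum_i x_i||^2 - sum_i ||x_i||^2
    return (s @ s - (X * X).sum()) / (n * (n - 1))

def classify(x, X1, X2):
    """Assign x to class 1 or 2 (hypothetical rule, see lead-in).

    Uses ||x - mu1||^2 - ||x - mu2||^2
         = (||mu1||^2 - ||mu2||^2) - 2 x'(mu1 - mu2),
    with the first bracket estimated by U-statistics.
    """
    p = x.shape[0]
    u_part = (u_stat_norm_sq(X1) - u_stat_norm_sq(X2)) / p    # U-component (assumed form)
    p_part = 2.0 * (x @ (X1.mean(axis=0) - X2.mean(axis=0))) / p  # projection on x (assumed form)
    score = u_part - p_part
    return 1 if score < 0 else 2
```

A negative score means the observation is estimated to be closer to the first class mean. No common covariance matrix is estimated or inverted, so the rule stays computable when the dimension exceeds the sample sizes.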
Pages: 269-283
Page count: 15