Asymptotic normality of Gini correlation in high dimension with applications to the K-sample problem

Cited by: 2
Authors
Sang, Yongli [1 ]
Dang, Xin [2 ]
Affiliations
[1] Univ Louisiana Lafayette, Dept Math, Lafayette, LA 70504 USA
[2] Univ Mississippi, Dept Math, University, MS 38677 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2023 / Vol. 17 / No. 2
Keywords
Asymptotic normality; categorical Gini correlation; distance correlation; high dimensional K-sample test; 2-SAMPLE TEST; KOLMOGOROV-SMIRNOV; DENSITY-FUNCTIONS; MULTIVARIATE; DISTRIBUTIONS; STATISTICS; DEPENDENCE; ENERGY; TESTS;
DOI
10.1214/23-EJS2165
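Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes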
020208 ; 070103 ; 0714 ;
Abstract
The categorical Gini correlation proposed by Dang et al. [7] is a dependence measure that characterizes independence between a categorical and a numerical variable. The asymptotic distributions of the sample correlation under dependence and independence have been established when the dimension of the numerical variable is fixed. However, its asymptotic behavior for high-dimensional data has not been explored. In this paper, we develop the central limit theorem for the Gini correlation in the more realistic setting where the dimensionality of the numerical variable diverges. We then construct a powerful and consistent test for the K-sample problem based on the asymptotic normality. The proposed test not only avoids the computational burden but also gains power over the permutation procedure. Simulation studies and real data illustrations show that the proposed test is more competitive than existing methods across a broad range of realistic situations, especially in unbalanced cases.
Pages: 2539-2574
Page count: 36
Related papers
36 entries in total
  • [31] Székely G. J., 2004, InterStat, V5, P1249
  • [32] Objective Automatic Assessment of Rehabilitative Speech Treatment in Parkinson's Disease
    Tsanas, Athanasios
    Little, Max A.
    Fox, Cynthia
    Ramig, Lorraine O.
    [J]. IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2014, 22 (01) : 181 - 190
  • [33] Testing homogeneity for multiple nonnegative distributions with excess zero observations
    Wang, Chunlin
    Marriott, Paul
    Li, Pengfei
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2017, 114 : 146 - 157
  • [34] Estimating Feature-Label Dependence Using Gini Distance Statistics
    Zhang, Silu
    Dang, Xin
    Nguyen, Dao
    Wilkins, Dawn
    Chen, Yixin
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (06) : 1947 - 1963
  • [35] Interpoint distance based two sample tests in high dimension
    Zhu, Changbo
    Shao, Xiaofeng
    [J]. BERNOULLI, 2021, 27 (02) : 1189 - 1211
  • [36] DISTANCE-BASED AND RKHS-BASED DEPENDENCE METRICS IN HIGH DIMENSION
    Zhu, Changbo
    Zhang, Xianyang
    Yao, Shun
    Shao, Xiaofeng
    [J]. ANNALS OF STATISTICS, 2020, 48 (06) : 3366 - 3394