Asymptotic normality of Gini correlation in high dimension with applications to the K-sample problem

Cited by: 2
Authors
Sang, Yongli [1]
Dang, Xin [2]
Affiliations
[1] Univ Louisiana Lafayette, Dept Math, Lafayette, LA 70504 USA
[2] Univ Mississippi, Dept Math, University, MS 38677 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2023 / Vol. 17 / Issue 2
Keywords
Asymptotic normality; categorical Gini correlation; distance correlation; high dimensional K-sample test; 2-SAMPLE TEST; KOLMOGOROV-SMIRNOV; DENSITY-FUNCTIONS; MULTIVARIATE; DISTRIBUTIONS; STATISTICS; DEPENDENCE; ENERGY; TESTS;
DOI
10.1214/23-EJS2165
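Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes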
020208 ; 070103 ; 0714 ;
Abstract
The categorical Gini correlation proposed by Dang et al. [7] is a dependence measure that characterizes independence between a categorical and a numerical variable. The asymptotic distributions of the sample correlation under dependence and independence have been established when the dimension of the numerical variable is fixed. However, its asymptotic behavior for high-dimensional data has not been explored. In this paper, we develop a central limit theorem for the Gini correlation in the more realistic setting where the dimensionality of the numerical variable diverges. We then construct a powerful and consistent test for the K-sample problem based on this asymptotic normality. The proposed test not only avoids the computational burden of the permutation procedure but also gains power over it. Simulation studies and real data illustrations show that the proposed test is more competitive than existing methods across a broad range of realistic situations, especially in unbalanced cases.
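As an illustration of the quantity the abstract refers to, the following is a minimal sketch of a sample categorical Gini correlation. It assumes the form 1 - (sum_k p_k * Delta_k) / Delta, where Delta is the mean pairwise Euclidean distance over the pooled sample, Delta_k is the same quantity within category k, and p_k is the proportion of observations in category k. That form is not stated in this record and is an assumption here; the function names `mean_pairwise_distance` and `gini_correlation` are hypothetical, and this is not the authors' implementation or their proposed test.

```python
import numpy as np


def mean_pairwise_distance(x):
    """Average Euclidean distance over all distinct pairs of rows of x."""
    n = x.shape[0]
    if n < 2:
        # Convention chosen for this sketch: a single point has no spread.
        return 0.0
    diffs = x[:, None, :] - x[None, :, :]        # shape (n, n, d)
    dists = np.sqrt((diffs ** 2).sum(axis=-1))   # shape (n, n), zero diagonal
    # The full matrix counts every unordered pair twice, hence n * (n - 1).
    return dists.sum() / (n * (n - 1))


def gini_correlation(x, labels):
    """Sample categorical Gini correlation under the assumed definition.

    x      : array of shape (n,) or (n, d), the numerical variable
    labels : array of shape (n,), the categorical variable
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    labels = np.asarray(labels)
    n = x.shape[0]

    overall = mean_pairwise_distance(x)
    if overall == 0.0:
        raise ValueError("all observations coincide; the ratio is undefined")

    within = 0.0
    for k in np.unique(labels):
        xk = x[labels == k]
        within += (xk.shape[0] / n) * mean_pairwise_distance(xk)

    return 1.0 - within / overall
```

Under this assumed definition, well-separated groups give a value near 1, and groups drawn from the same distribution give a value near 0. The sketch builds the full n-by-n distance matrix, so memory grows quadratically in the sample size.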
Pages: 2539 - 2574
Number of pages: 36
Related papers
36 in total
  • [1] Two-sample test statistics for measuring discrepancies between two multivariate probability density functions using kernel-based density estimates
    Anderson, NH
    Hall, P
    Titterington, DM
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 1994, 50 (01) : 41 - 54
  • [2] On a new multivariate two-sample test
    Baringhaus, L
    Franz, C
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2004, 88 (01) : 190 - 206
  • [3] A nonparametric two-sample test applicable to high dimensional data
    Biswas, Munmun
    Ghosh, Anil K.
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2014, 123 : 160 - 171
  • [4] Hypothesis testing in the presence of multiple samples under density ratio models
    Cai, Song
    Chen, Jiahua
    Zidek, James V.
    [J]. STATISTICA SINICA, 2017, 27 (02) : 761 - 783
  • [5] A New Graph-Based Two-Sample Test for Multivariate and Object Data
    Chen, Hao
    Friedman, Jerome H.
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2017, 112 (517) : 397 - 409
  • [6] A rank-based Cramer-von-Mises-type test for two samples
    Curry, Jamye
    Dang, Xin
    Sang, Hailin
    [J]. BRAZILIAN JOURNAL OF PROBABILITY AND STATISTICS, 2019, 33 (03) : 425 - 454
  • [7] A new Gini correlation between quantitative and qualitative variables
    Dang, Xin
    Nguyen, Dao
    Chen, Yixin
    Zhang, Junying
    [J]. SCANDINAVIAN JOURNAL OF STATISTICS, 2021, 48 (04) : 1314 - 1343
  • [8] The Kolmogorov-Smirnov, Cramer-von Mises tests
    Darling, DA
    [J]. ANNALS OF MATHEMATICAL STATISTICS, 1957, 28 (04) : 823 - 838
  • [9] Dua Dheeru, 2019, UCI machine learning repository
  • [10] A test for the two-sample problem based on empirical characteristic functions
    Fernandez, V. Alba
    Gamero, M. D. Jimenez
    Garcia, J. Munoz
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2008, 52 (07) : 3730 - 3748