A distribution-free test of independence based on mean variance index

Cited by: 14
Authors
Cui, Hengjian [1 ]
Zhong, Wei [2 ,3 ]
Affiliations
[1] Capital Normal Univ, Sch Math Sci, Dept Stat, Beijing 100048, Peoples R China
[2] Xiamen Univ, Sch Econ, Wang Yanan Inst Studies Econ, Dept Stat,MOE Key Lab Econometr, Xiamen 361005, Fujian, Peoples R China
[3] Xiamen Univ, Fujian Key Lab Stat, Xiamen 361005, Fujian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Asymptotic null distribution; Cramer-von Mises distance; Conditional distribution function; Mean variance index; Test of independence;
DOI
10.1016/j.csda.2019.05.004
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
A new test based on the mean variance (MV) index is proposed for testing independence between a categorical random variable Y and a continuous random variable X. The MV index can be viewed as the weighted average of the Cramér-von Mises distances between the conditional distribution functions of X given each class of Y and the unconditional distribution function of X. The MV index is zero if and only if X and Y are independent. The new MV test has several appealing merits. First, an explicit form of the asymptotic null distribution is derived under independence between X and Y, which provides an efficient way to compute critical values and p-values. Second, no assumption on the distributions of the two random variables is required, and the test statistic is invariant under one-to-one transformations of the continuous random variable. It is essentially a rank test and distribution-free, so it is resistant to heavy-tailed distributions and extreme values in practice. Monte Carlo simulations demonstrate its excellent finite-sample performance. In applications, the MV test is applied to two high-dimensional gene expression data sets to detect the significant genes associated with tumor types. (C) 2019 Elsevier B.V. All rights reserved.
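The verbal definition in the abstract can be made concrete with a minimal Python sketch. It assumes the plug-in estimate in which every distribution function is replaced by its empirical counterpart: the sample MV index is the sum over classes r of p_r times the average over observations of (F_r(X_j) - F(X_j))^2. The function names `mv_index` and `permutation_pvalue` are hypothetical, and the permutation p-value is a generic stand-in used here for illustration; it is not the explicit asymptotic null distribution derived in the paper.

```python
import numpy as np


def mv_index(x, y):
    """Plug-in estimate of the MV index between continuous x and categorical y.

    Computes sum_r p_r * mean_j (F_r(x_j) - F(x_j))**2, where F is the
    empirical CDF of x and F_r is the empirical CDF of x within class r.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    n = x.size
    # Empirical CDF of the pooled sample, evaluated at every observation.
    f_all = np.searchsorted(np.sort(x), x, side="right") / n
    total = 0.0
    for label in np.unique(y):
        x_r = np.sort(x[y == label])
        p_r = x_r.size / n  # estimated class probability
        # Empirical CDF of class r, evaluated at every observation.
        f_r = np.searchsorted(x_r, x, side="right") / x_r.size
        total += p_r * np.mean((f_r - f_all) ** 2)
    return total


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for the MV index (assumption: not the paper's method)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    observed = mv_index(x, y)
    # Shuffling the labels breaks any dependence between x and y.
    exceed = sum(mv_index(x, rng.permutation(y)) >= observed for _ in range(n_perm))
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    labels = rng.integers(0, 3, size=200)
    values = rng.standard_t(df=2, size=200) + 0.8 * labels  # heavy-tailed, dependent on labels
    print(mv_index(values, labels), permutation_pvalue(values, labels))
```

Because the estimate depends on the data only through order comparisons, applying a strictly increasing transformation such as `np.exp` to `x` leaves the value unchanged, which matches the rank-based, distribution-free behaviour the abstract describes.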
Pages: 117-133
Number of pages: 17