Pearson's goodness-of-fit tests for sparse distributions

Cited by: 5
Authors
Chang, Shuhua [1, 2]
Li, Deli [3]
Qi, Yongcheng [4]
Affiliations
[1] Yango Univ, Coordinated Innovat Ctr Computable Modeling Manag, Fuzhou, Fujian, Peoples R China
[2] Tianjin Univ Finance & Econ, Coordinated Innovat Ctr Computable Modeling Manag, Tianjin, Peoples R China
[3] Lakehead Univ Thunder Bay, Dept Math Sci, Thunder Bay, ON P7B 5E1, Canada
[4] Univ Minnesota, Dept Math & Stat, Duluth, MN 55812 USA
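Funding
Natural Sciences and Engineering Research Council of Canada; U.S. National Science Foundation; National Natural Science Foundation of China
Keywords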
Goodness-of-fit; discrete distribution; sparse distribution; normal approximation; chi-square approximation; CHI-SQUARE; LIKELIHOOD RATIO; STATISTICS;
DOI
10.1080/02664763.2021.2017413
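Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract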
Pearson's chi-squared test is widely used to test the goodness of fit between categorical data and a given discrete distribution function. When the number of sets of the categorical data, say k, is a fixed integer, Pearson's chi-squared test statistic converges in distribution to a chi-squared distribution with k-1 degrees of freedom when the sample size n goes to infinity. In real applications, the number k often changes with n and may be even much larger than n. By using the martingale techniques, we prove that Pearson's chi-squared test statistic converges to the normal under quite general conditions. We also propose a new test statistic which is more powerful than chi-squared test statistic based on our simulation study. A real application to lottery data is provided to illustrate our methodology.
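The statistic described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' method: it computes Pearson's chi-squared statistic for k cells and then standardizes it with the classical large-sample moments (mean k-1, variance 2(k-1)) to get a normal-approximation p-value. That centering and scaling is an assumption of this sketch; the paper's exact conditions and its proposed new statistic are not reproduced here. The function names are hypothetical.

```python
import math
import random


def pearson_statistic(counts, probs):
    """Pearson's chi-squared statistic: sum of (O_i - n*p_i)^2 / (n*p_i)."""
    n = sum(counts)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(counts, probs))


def normal_approx_pvalue(counts, probs):
    """Upper-tail p-value from a normal approximation.

    Assumption of this sketch: the statistic is centered at k - 1 and
    scaled by sqrt(2 * (k - 1)), the moments of a chi-squared variable
    with k - 1 degrees of freedom. This is a rough standardization only.
    """
    k = len(counts)
    stat = pearson_statistic(counts, probs)
    z = (stat - (k - 1)) / math.sqrt(2 * (k - 1))
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z) for standard normal Z
    return stat, z, p_value


if __name__ == "__main__":
    # Sparse setting: many more cells (k) than observations (n).
    random.seed(0)
    k, n = 1000, 200
    probs = [1.0 / k] * k
    counts = [0] * k
    for _ in range(n):
        counts[random.randrange(k)] += 1  # draw under the uniform null
    print(normal_approx_pvalue(counts, probs))
```

With k fixed and n large, the chi-squared reference distribution with k-1 degrees of freedom would normally be used instead; the normal form above is only meant to illustrate the regime where k grows with n.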
Pages: 1078-1093
Number of pages: 16
Related papers
22 in total
[1]   METHODS FOR EXACT GOODNESS-OF-FIT TESTS [J].
BAGLIVO, J ;
OLIVIER, D ;
PAGANO, M .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1992, 87 (418) :464-469
[2]   THE CHI-2 TEST OF GOODNESS OF FIT [J].
COCHRAN, WG .
ANNALS OF MATHEMATICAL STATISTICS, 1952, 23 (03) :315-345
[3]   UNBIASEDNESS OF CHI-SQUARE, LIKELIHOOD RATIO, AND OTHER GOODNESS OF FIT TESTS FOR EQUAL CELL CASE [J].
COHEN, A ;
SACKROWITZ, HB .
ANNALS OF STATISTICS, 1975, 3 (04) :959-964
[4]   PEARSONS-X2 AND THE LOGLIKELIHOOD RATIO STATISTIC-G2 - A COMPARATIVE REVIEW [J].
CRESSIE, N ;
READ, TRC .
INTERNATIONAL STATISTICAL REVIEW, 1989, 57 (01) :19-43
[5]   Testing variance components in balanced linear growth curve models [J].
Drikvandi, Reza ;
Khodadadi, Ahmad ;
Verbeke, Geert .
JOURNAL OF APPLIED STATISTICS, 2012, 39 (03) :563-572
[7]   ASYMPTOTIC NORMALITY AND EFFICIENCY FOR CERTAIN GOODNESS-OF-FIT TESTS [J].
HOLST, L .
BIOMETRIKA, 1972, 59 (01) :137-145
[8]   VALIDITY OF THE CHI-SQUARED TEST WHEN EXPECTED FREQUENCIES ARE SMALL - LIST OF RECENT RESEARCH REFERENCES [J].
HUTCHINSON, TP .
COMMUNICATIONS IN STATISTICS PART A-THEORY AND METHODS, 1979, 8 (04) :327-335
[9]   Estimate-based goodness-of-fit test for large sparse multinomial distributions [J].
Kim, Sung-Ho ;
Choi, Hyemi ;
Lee, Sangjin .
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (04) :1122-1131
[10]   AN EMPIRICAL-INVESTIGATION OF GOODNESS-OF-FIT STATISTICS FOR SPARSE MULTINOMIALS [J].
KOEHLER, KJ ;
LARNTZ, K .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1980, 75 (370) :336-344