Assessing goodness of fit of generalized linear models to sparse data using higher order moment corrections

Cited by: 4
Authors
Sudhir R. Paul
Dianliang Deng
Affiliations
[1] University of Windsor, Windsor
[2] University of Regina, Regina
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Edgeworth approximation; sparse data; generalized linear model; binomial and Poisson model. Mathematics Subject Classification: Primary 62J12, 62E17; Secondary 62F05
DOI
10.1007/s13571-012-0037-0
Abstract
The purpose of this paper is to assess goodness of fit properties of the conditional distribution of the Pearson statistic, for noncanonical generalized linear models with data that are extensive but sparse, by Edgeworth approximation of p-values using higher order moment corrections. In this paper we obtain approximations to the fourth moments of the unconditional and conditional distribution of the modified Pearson statistic with noncanonical links. This extends previous results where approximations to the first three moments are available and completes all usual higher order moment calculations of the modified Pearson statistic. We consider the asymptotic limit in which the data are extensive but sparse and a supplementary estimating equation for the dispersion parameter. Specific results for binomial and Poisson data are obtained separately. The methods for assessing goodness of fit using higher order moments are discussed. For testing goodness of fit of generalized linear models to sparse data some simulations are conducted to compare, in terms of empirical size and power, the performance of the classical Pearson statistic (X²), a standardized modified Pearson statistic (X*), a standardized modified deviance statistic (D*), a modified Pearson statistic based on Edgeworth approximation with the first three conditional moments (Z1), and a modified Pearson statistic based on Edgeworth approximation with the first four conditional moments (Z2). The statistic Z2 holds level most effectively in most situations and has some power advantage. © 2012, Indian Statistical Institute.
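As a rough illustration of how an Edgeworth-corrected p-value uses higher order moments, the sketch below applies the textbook Edgeworth expansion to a standardized statistic. It is not the paper's derivation: the function name `edgeworth_upper_p` and its arguments are hypothetical, and the skewness and excess kurtosis are assumed to be supplied by the user rather than computed from the conditional moments of the modified Pearson statistic. Passing only the skewness loosely mirrors a three-moment correction, and passing the excess kurtosis as well loosely mirrors a four-moment correction.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def edgeworth_upper_p(z, skew, exkurt=None):
    """Upper-tail p-value for a standardized statistic z (hypothetical helper).

    skew   : standardized third cumulant (assumed given, not derived here)
    exkurt : standardized fourth cumulant (excess kurtosis); if None, only
             the skewness term of the expansion is used
    """
    he2 = z * z - 1.0                      # Hermite polynomial He2
    correction = skew / 6.0 * he2
    if exkurt is not None:
        he3 = z ** 3 - 3.0 * z             # Hermite polynomial He3
        he5 = z ** 5 - 10.0 * z ** 3 + 15.0 * z  # Hermite polynomial He5
        correction += exkurt / 24.0 * he3 + skew ** 2 / 72.0 * he5
    cdf = normal_cdf(z) - normal_pdf(z) * correction
    # The expansion is not a proper distribution function, so clip to [0, 1].
    return min(1.0, max(0.0, 1.0 - cdf))


# Illustrative values only; they are not taken from the paper.
p_three_moment = edgeworth_upper_p(2.0, skew=0.6)
p_four_moment = edgeworth_upper_p(2.0, skew=0.6, exkurt=0.5)
```

With zero skewness and zero excess kurtosis the function reduces to the ordinary normal tail probability, which gives a quick sanity check on the correction terms.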
Pages: 195-210
Number of pages: 15
Related papers
8 in total
  • [1] Farrington C.P., On assessing goodness of fit of generalized linear models to sparse data, J. R. Stat. Soc. Ser. B Stat. Methodol., 58, pp. 349-360, (1996)
  • [2] Koehler K.J., Goodness-of-fit tests for log-linear models in sparse contingency tables, J. Amer. Statist. Assoc., 81, pp. 483-493, (1986)
  • [3] Koehler K.J., Larntz K., An empirical investigation of goodness-of-fit statistics for sparse multinomials, J. Amer. Statist. Assoc., 75, pp. 336-344, (1980)
  • [4] McCullagh P., Tensor notation and cumulants of polynomials, Biometrika, 71, pp. 461-476, (1984)
  • [5] McCullagh P., On the asymptotic distribution of Pearson’s statistic in linear exponential family models, International Statistical Review, 53, pp. 61-67, (1985)
  • [6] McCullagh P., The conditional distribution of goodness-of-fit statistics for discrete data, J. Amer. Statist. Assoc., 81, pp. 104-107, (1986)
  • [7] Moore D.F., Asymptotic properties of moments for overdispersed counts and proportions, Biometrika, 73, pp. 583-588, (1986)
  • [8] Paul S.R., Deng D., Goodness of fit of generalized linear models to sparse data, J. R. Stat. Soc. Ser. B, 62, pp. 323-333, (2000)