Application of detecting and taking overdispersion into account in Poisson regression model

Cited by: 13
Authors
Bouche, G. [1 ,2 ]
Lepage, B. [1 ]
Migeot, V. [1 ]
Ingrand, P. [2 ]
Affiliations
[1] Univ Poitiers, CHU Poitiers, Unite Evaluat Med Pole Pharm & Sante Publ, F-86021 Poitiers, France
[2] Univ Poitiers, CHU Poitiers, INSERM, Ctr Invest Clin,CIC 802, F-86021 Poitiers, France
Source
REVUE D EPIDEMIOLOGIE ET DE SANTE PUBLIQUE | 2009 / Vol. 57 / No. 04
Keywords
Poisson distribution; Statistical models; Data interpretation; Epidemiology; ZERO; RATES;
DOI
10.1016/j.respe.2009.02.209
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Background. - Researchers often use the Poisson regression model to analyze count data. Overdispersion can occur when a Poisson regression model is used, resulting in an underestimation of the variance of the regression model parameters. Our objective was to take overdispersion into account and assess its impact, with an illustration based on data from a study investigating the relationship between use of the Internet to seek health information and the number of primary care consultations.
Methods. - Three methods (overdispersed Poisson regression, a robust estimator, and negative binomial regression) were applied to take overdispersion into account in explaining variation in the number (Y) of primary care consultations. We tested for overdispersion in the Poisson regression model using the ratio of the sum of squared Pearson residuals (the Pearson chi-square statistic) to the number of degrees of freedom (χ²/df). We then fitted the three models and compared their parameter estimates with those given by the Poisson regression model.
Results. - The variance of the number of primary care consultations (Var[Y] = 21.03) was greater than the mean (E[Y] = 5.93) and the χ²/df ratio was 3.26, which confirmed overdispersion. Standard errors of the parameters varied greatly between the Poisson regression model and the three other regression models. Interpretation of the estimates for two variables (using the Internet to seek health information and single-parent family) would have changed according to the model retained, with significance levels of 0.06 and 0.002 (Poisson), 0.29 and 0.09 (overdispersed Poisson), 0.29 and 0.13 (use of a robust estimator) and 0.45 and 0.13 (negative binomial), respectively.
Conclusion. - Different methods exist to solve the problem of underestimating variance in the Poisson regression model when overdispersion is present. The negative binomial regression model seems to be particularly accurate because of its theoretical distribution; in addition, this regression is easy to perform with ordinary statistical software packages. (C) 2009 Elsevier Masson SAS. All rights reserved.
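The dispersion check and the overdispersed Poisson correction described in the abstract can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' code: the function names and the toy counts are hypothetical, and the sketch assumes the usual quasi-Poisson convention in which each standard error is multiplied by the square root of the χ²/df ratio, so the Wald z statistic is divided by the same factor. Only the figures 21.03, 5.93, 3.26, 0.06 and 0.002 come from the abstract.

```python
from math import sqrt
from statistics import NormalDist


def pearson_dispersion(y, mu, n_params):
    """Sum of squared Pearson residuals divided by residual degrees of freedom."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)


def rescaled_p_value(p_poisson, dispersion):
    """Two-sided Wald p-value after inflating the standard error by sqrt(dispersion).

    Assumes the Poisson p-value came from a two-sided normal (Wald) test.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - p_poisson / 2)   # z implied by the Poisson p-value
    z_scaled = z / sqrt(dispersion)             # larger SE means smaller z
    return 2 * (1 - std_normal.cdf(z_scaled))


# Toy counts, invented for illustration (NOT the study data).
# For an intercept-only Poisson model the fitted mean is the sample mean.
y = [0, 1, 2, 3, 4, 6, 9, 15]
mean_y = sum(y) / len(y)
phi_toy = pearson_dispersion(y, [mean_y] * len(y), n_params=1)

# Crude variance-to-mean ratio from the figures reported in the abstract.
# It ignores covariates, so it need not equal the model-based ratio of 3.26.
crude_ratio = 21.03 / 5.93

# Rescale the Poisson p-values reported in the abstract with chi2/df = 3.26.
phi_reported = 3.26
p_internet = rescaled_p_value(0.06, phi_reported)
p_single_parent = rescaled_p_value(0.002, phi_reported)

print(f"toy dispersion: {phi_toy:.2f}")
print(f"crude variance/mean ratio: {crude_ratio:.2f}")
print(f"rescaled p-values: {p_internet:.2f}, {p_single_parent:.2f}")
```

A ratio well above 1 signals overdispersion. Because the reported Poisson p-values are rounded, the rescaled values can only be expected to land near, not exactly on, the overdispersed Poisson significance levels quoted in the abstract (0.29 and 0.09).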
Pages: 285-296
Number of pages: 12