Analyzing clustered count data with a cluster-specific random effect zero-inflated Conway-Maxwell-Poisson distribution

Cited by: 12
Authors
Choo-Wosoba, Hyoyoung [1]
Datta, Somnath [2]
Affiliations
[1] Univ Louisville, Dept Bioinformat & Biostat, Louisville, KY 40202 USA
[2] Univ Florida, Dept Biostat, Gainesville, FL USA
Funding
National Institutes of Health (USA);
Keywords
Gaussian-Hermite (G-H) quadrature; mixed effects model; next-generation sequencing (NGS) data; Poisson distribution; under- and over-dispersions; REGRESSION; MODEL;
DOI
10.1080/02664763.2017.1312299
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model the excess zeros that are often present in these assays. The most common count distributions for analyzing such data are the Poisson and the negative binomial. However, a Poisson distribution can only handle equidispersed data, and a negative binomial distribution can only cope with overdispersion. By contrast, a Conway-Maxwell-Poisson (CMP) distribution [4] can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations; we therefore develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike in Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via Gaussian quadrature to circumvent this issue. A test for zero-inflation has also been developed in our setting. Finite-sample properties of our estimators and test have been investigated through extensive simulations. Finally, the statistical methodology has been applied to analyze the maize data mentioned above.
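As a rough illustration of the quadrature idea mentioned in the abstract, the sketch below approximates the marginal log-likelihood of one cluster under a zero-inflated CMP model. It is not the authors' implementation. The parametrization is an assumption: a normal random intercept b ~ N(0, sigma^2) is added to log(lambda), the dispersion nu and the zero-inflation probability p are shared within the cluster, and the CMP normalizing constant is truncated at `max_terms`. All function names are hypothetical.

```python
import math
import numpy as np

def cmp_log_z(log_lam, nu, max_terms=200):
    """Log of the CMP normalizing constant sum_j lam^j / (j!)^nu (truncated series)."""
    j = np.arange(max_terms)
    # log_fact[j] = log(j!), built by cumulative sums of log(1), log(2), ...
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, max_terms)))))
    return np.logaddexp.reduce(j * log_lam - nu * log_fact)

def zicmp_logpmf(y, log_lam, nu, p):
    """Log pmf of a zero-inflated CMP: extra mass p at zero, CMP with weight 1 - p."""
    log_cmp = y * log_lam - nu * math.lgamma(y + 1) - cmp_log_z(log_lam, nu)
    if y == 0:
        return np.logaddexp(math.log(p), math.log1p(-p) + log_cmp)
    return math.log1p(-p) + log_cmp

def cluster_loglik(y, eta, nu, p, sigma, n_nodes=20):
    """Gauss-Hermite approximation of log E_b[ prod_j f(y_j | eta_j + b) ].

    Assumed model: log(lambda_j) = eta_j + b with b ~ N(0, sigma^2).
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    b = math.sqrt(2.0) * sigma * nodes  # change of variables for the N(0, sigma^2) density
    cond = np.array([
        sum(zicmp_logpmf(yj, ej + bk, nu, p) for yj, ej in zip(y, eta))
        for bk in b
    ])
    # integral ~ (1 / sqrt(pi)) * sum_k w_k * exp(cond_k), evaluated on the log scale
    return np.logaddexp.reduce(cond + np.log(weights)) - 0.5 * math.log(math.pi)

# Example: one cluster with three counts and hypothetical linear predictors.
print(cluster_loglik([0, 2, 5], [0.1, 0.5, 1.0], nu=1.3, p=0.2, sigma=0.8))
```

In this sketch nu = 1 recovers the Poisson case, nu > 1 gives underdispersion, and nu < 1 gives overdispersion. The truncation length and the number of quadrature nodes are tuning choices that would need checking for large rates or large sigma.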
Pages: 799-814
Number of pages: 16
Related papers
19 in total
[1]   The zero-inflated Conway-Maxwell-Poisson distribution: Bayesian inference, regression modeling and influence diagnostic [J].
Barriga, Gladys D. C. ;
Louzada, Francisco .
STATISTICAL METHODOLOGY, 2014, 21 :23-34
[2]  
Blocker A.W., 2014, FASTGHQUAD FAST RCPP
[3]   Marginal regression models for clustered count data based on zero-inflated Conway-Maxwell-Poisson distribution with applications [J].
Choo-Wosoba, Hyoyoung ;
Levy, Steven M. ;
Datta, Somnath .
BIOMETRICS, 2016, 72 (02) :606-618
[4]  
Conway R.W., 1962, J. Ind. Eng, V12, P132, DOI 10.1198/JCGS.2010.08080
[5]   MIXED MODEL AND ESTIMATING EQUATION APPROACHES FOR ZERO INFLATION IN CLUSTERED BINARY RESPONSE DATA WITH APPLICATION TO A DATING VIOLENCE STUDY [J].
Fulton, Kara A. ;
Liu, Danping ;
Haynie, Denise L. ;
Albert, Paul S. .
ANNALS OF APPLIED STATISTICS, 2015, 9 (01) :275-299
[6]   Zero-inflated Poisson and binomial regression with random effects: A case study [J].
Hall, DB .
BIOMETRICS, 2000, 56 (04) :1030-1039
[7]   Pattern-Mixture Zero-Inflated Mixed Models for Longitudinal Unbalanced Count Data with Excessive Zeros [J].
Hasan, M. Tariqul ;
Sneddon, Gary ;
Ma, Renjun .
BIOMETRICAL JOURNAL, 2009, 51 (06) :946-960
[8]   Score tests for zero-inflated Poisson models [J].
Jansakul, N ;
Hinde, JP .
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2002, 40 (01) :75-96
[9]  
Kutner M.H., 2003, APPL LINEAR REGRESSI
[10]   On the effect of the number of quadrature points in a logistic random-effects model: an example [J].
Lesaffre, E ;
Spiessens, B .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2001, 50 :325-335