A Marginalized Model for Zero-Inflated, Overdispersed, and Correlated Count Data


Iddi and Molenberghs (2012) merged the attractive features of the so-called combined model of Molenberghs et al. (2010) and the marginalized model of Heagerty (1999) for hierarchical non-Gaussian data with overdispersion. In this model, the fixed-effect parameters retain their marginal interpretation. Lee et al. (2011) also developed an extension of Heagerty (1999) to handle zero-inflation in count data, using the hurdle model. To bring together all of these features, a marginalized, zero-inflated, overdispersed model for correlated count data is proposed. Using two empirical data sets, it is shown that the proposed model leads to important improvements in model fit.

DOI Code: 10.1285/i20705948v6n2p149

Keywords: Marginal multilevel model; Maximum likelihood estimation; Random effects model; Negative binomial; Overdispersion; Partial Marginalization; Poisson model; Zero-Inflation.


Aerts, M., Geys, H., Molenberghs, G., and Ryan, L. (2002) Topics in Modelling of Clustered Data. London: Chapman & Hall.

Agresti, A. (2002) Categorical Data Analysis. New York: John Wiley & Sons.

Aregay, M., Shkedy, Z., and Molenberghs, G. (2012) A hierarchical Bayesian approach for the analysis of longitudinal count data with overdispersion: a simulation study. Computational Statistics and Data Analysis, 00, 000-000.

Breslow, N.E. (1984) Extra-Poisson variation in log-linear models. Applied Statistics, 33, 38-44.

Breslow, N.E. and Clayton, D.G. (1993) Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88, 9-25.

Engel, B. and Keen, A. (1994) A simple approach for the analysis of generalized linear mixed models. Statistica Neerlandica, 48, 1-22.

Faught, E., Wilder, B.J., Ramsay, R.E., Reife, R.A., Kramer, L.D., Pledger, G.W., and Karim, R.M. (1996) Topiramate placebo-controlled dose-ranging trial in refractory partial epilepsy using 200-, 400-, and 600-mg daily dosage. Neurology, 46, 1684-1690.

Fitzmaurice, G.M. and Laird, N.M. (1993) A likelihood based method for analysing longitudinal binary responses. Biometrika, 80, 141-151.

Hall, D.B. (2000) Zero-inflated Poisson and binomial regression with random effects: a case study. Biometrics, 56, 1030-1039.

Hall, D.B. and Zhang, Z. (2004) Marginal models for zero inflated clustered data. Statistical Modelling, 4, 161-180.

Heagerty, P.J. (1999) Marginally specified logistic-normal models for longitudinal binary data. Biometrics, 55, 688-698.

Heagerty, P.J. and Zeger, S.L. (2000) Marginalized multilevel models and likelihood inference (with comments and a rejoinder by the authors). Statistical Science, 15, 1-26.

Iddi, S. and Molenberghs, G. (2012) A combined overdispersed and marginalized multilevel model. Computational Statistics and Data Analysis, 56, 1944-1951.

Kassahun, W., Neyens, T., Faes, C., Molenberghs, G. and Verbeke, G. (2012) A zero-inflated overdispersed hierarchical Poisson model. Submitted for publication.

Laird, N.M., and Ware, J.H. (1982) Random effects models for longitudinal data. Biometrics, 38, 963-974.

Lambert, D. (1992) Zero-Inflated Poisson regression, with an application to defects in manufacturing. Technometrics, 34, 1-13.

Lawless, J. (1987) Negative binomial and mixed Poisson regression. The Canadian Journal of Statistics, 15, 209-225.

Lee, K., Joo, Y., Song, J.J., and Harper, D.W. (2011) Analysis of zero-inflated clustered count data: a marginalized model approach. Computational Statistics and Data Analysis, 55, 824-837.

Molenberghs, G. and Lesaffre, E. (1994) Marginal modelling of correlated ordinal data using a multivariate Plackett distribution. Journal of the American Statistical Association, 89, 633-644.

Molenberghs, G., Verbeke, G., and Demétrio, C. (2007) An extended random-effects approach to modeling repeated, overdispersed count data. Lifetime Data Analysis, 13, 513-531.

Molenberghs, G., Verbeke, G., Demétrio, C., and Vieira, A. (2010) A family of generalized linear models for repeated measures with normal and conjugate random effects. Statistical Science, 25, 325-347.

Molenberghs, G. and Verbeke, G. (2005) Models for Discrete Longitudinal Data. New York: Springer.

Min, Y. and Agresti, A. (2005) Random effect models for repeated measures of zero-inflated count data. Statistical Modelling, 5, 1-19.

Mullahy, J. (1986) Specification and testing of some modified count data models. Journal of Econometrics, 33, 341-365.

Nelder, J.A. and Wedderburn, R.W.M. (1972) Generalized linear models. Journal of the Royal Statistical Society, Series A, 135, 370-384.

Pinheiro, J.C. and Bates, D.M. (1995) Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics, 4, 12-35.

Pinheiro, J.C., and Bates, D.M. (2000) Mixed Effects Models in S and S-Plus. New York: Springer-Verlag.

Ridout, M., Hinde, J. and Demétrio, C.G.B. (2001) A score test for a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives. Biometrics, 57, 219-233.

van Iersel, M., Oetting, R., and Hall, D.B. (2001) Imidacloprid applications by subirrigation for control of silverleaf whitefly on poinsettia. Journal of Economic Entomology, 94, 666-672.

Yau, K.K.W. and Lee, A.H. (2001) Zero-inflated Poisson regression with random effects to evaluate an occupational injury prevention programme. Statistics in Medicine, 20, 2907-2920.

Zeger, S.L., Liang, K.-Y., and Albert, P.S. (1988) Models for longitudinal data: a generalized estimating equation approach. Biometrics, 44, 1049-1060.

This work is licensed under a Creative Commons Attribution - NonCommercial - NoDerivs 3.0 Italy License.