
A flexible zero-inflated model to address data dispersion

Author

Listed:
  • Sellers, Kimberly F.
  • Raim, Andrew

Abstract

Excess zeroes are often thought of as a cause of data over-dispersion (i.e. when the variance exceeds the mean); this claim is not entirely accurate. In actuality, excess zeroes reduce the mean of a dataset, thus inflating the dispersion index (i.e. the variance divided by the mean). While this results in an increased chance for data over-dispersion, the implication is not guaranteed. Thus, one should consider a flexible distribution that not only can account for excess zeroes, but can also address potential over- or under-dispersion. A zero-inflated Conway–Maxwell–Poisson (ZICMP) regression allows for modeling the relationship between explanatory and response variables, while capturing the effects due to excess zeroes and dispersion. This work derives the ZICMP model and illustrates its flexibility, extrapolates the corresponding likelihood ratio test for the presence of significant data dispersion, and highlights various statistical properties and model fit through several examples.
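The abstract's central point, that excess zeroes raise the dispersion index without guaranteeing over-dispersion, can be checked numerically. The sketch below is not taken from the paper. It assumes the standard Conway–Maxwell–Poisson form P(Y = y) proportional to lambda^y / (y!)^nu, mixed with a point mass at zero of weight p, and it approximates the infinite normalizing sum with a fixed number of terms. The names zicmp_pmf and dispersion_index, the parameter names lam, nu and p, and the 200-term truncation are illustrative assumptions.

```python
import math


def zicmp_pmf(lam, nu, p, terms=200):
    """Approximate ZICMP probabilities for y = 0, ..., terms - 1.

    Assumed form (not quoted from the paper):
      P(Y = 0) = p + (1 - p) / Z
      P(Y = y) = (1 - p) * lam**y / ((y!)**nu * Z)   for y > 0
    where Z = sum_j lam**j / (j!)**nu, truncated here to `terms` terms.
    """
    # Work on the log scale so large factorials do not overflow.
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    shift = max(log_w)
    w = [math.exp(v - shift) for v in log_w]
    z = sum(w)
    pmf = [(1.0 - p) * wi / z for wi in w]
    pmf[0] += p  # extra point mass at zero
    return pmf


def dispersion_index(pmf):
    """Variance divided by mean for a pmf given as a list over 0, 1, 2, ..."""
    mean = sum(y * q for y, q in enumerate(pmf))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(pmf))
    return var / mean


if __name__ == "__main__":
    # nu = 1 is the Poisson case: zero inflation pushes the index above 1.
    print(dispersion_index(zicmp_pmf(2.0, 1.0, 0.3)))
    # nu = 3 is under-dispersed: a modest share of extra zeroes raises
    # the index, yet it can remain below 1.
    print(dispersion_index(zicmp_pmf(2.0, 3.0, 0.0)))
    print(dispersion_index(zicmp_pmf(2.0, 3.0, 0.1)))
```

With nu = 1 the index equals 1 + p * lam under these assumptions, so the first call gives a value near 1.6. With nu = 3 and p = 0.1 the index is larger than in the uninflated case but stays below 1, which is the situation the abstract describes: zero inflation alone does not force over-dispersion.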

Suggested Citation

  • Sellers, Kimberly F. & Raim, Andrew, 2016. "A flexible zero-inflated model to address data dispersion," Computational Statistics & Data Analysis, Elsevier, vol. 99(C), pages 68-80.
  • Handle: RePEc:eee:csdana:v:99:y:2016:i:c:p:68-80
    DOI: 10.1016/j.csda.2016.01.007

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947316000165
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Galit Shmueli & Thomas P. Minka & Joseph B. Kadane & Sharad Borle & Peter Boatwright, 2005. "A useful distribution for fitting discrete data: revival of the Conway–Maxwell–Poisson distribution," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 54(1), pages 127-142, January.
    2. Rothenberg, Thomas J, 1971. "Identification in Parametric Models," Econometrica, Econometric Society, vol. 39(3), pages 577-591, May.
    3. Daniel B. Hall, 2000. "Zero-Inflated Poisson and Binomial Regression with Random Effects: A Case Study," Biometrics, The International Biometric Society, vol. 56(4), pages 1030-1039, December.

    Citations



    Cited by:

    1. Maria De Jesus & Nora Sullivan & William Hopman & Alex Martinez & Paul David Glenn & Saviour Msopa & Brooke Milligan & Noah Doney & William Howell & Kimberly Sellers & Monica C. Jackson, 2023. "Examining the Role of Quality of Institutionalized Healthcare on Maternal Mortality in the Dominican Republic," IJERPH, MDPI, vol. 20(14), pages 1-11, July.
    2. Wang, Fan & Li, Heng & Dong, Chao, 2021. "Understanding near-miss count data on construction sites using greedy D-vine copula marginal regression," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    3. Somayeh Ghorbani Gholiabad & Abbas Moghimbeigi & Javad Faradmal, 2021. "Three-level zero-inflated Conway–Maxwell–Poisson regression model for analyzing dispersed clustered count data with extra zeros," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(2), pages 415-439, November.
    4. Kimberly F. Sellers & Andrew W. Swift & Kimberly S. Weems, 2017. "A flexible distribution class for count data," Journal of Statistical Distributions and Applications, Springer, vol. 4(1), pages 1-21, December.
    5. Rahma Abid & Célestin C. Kokonendji & Afif Masmoudi, 2021. "On Poisson-exponential-Tweedie models for ultra-overdispersed count data," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 105(1), pages 1-23, March.
    6. Daniel Rodriguez, 2023. "Assessing Area under the Curve as an Alternative to Latent Growth Curve Modeling for Repeated Measures Zero-Inflated Poisson Data: A Simulation Study," Stats, MDPI, vol. 6(1), pages 1-11, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. John Haslett & Andrew C. Parnell & John Hinde & Rafael de Andrade Moral, 2022. "Modelling Excess Zeros in Count Data: A New Perspective on Modelling Approaches," International Statistical Review, International Statistical Institute, vol. 90(2), pages 216-236, August.
    2. Nguimkeu, Pierre & Denteh, Augustine & Tchernis, Rusty, 2019. "On the estimation of treatment effects with endogenous misreporting," Journal of Econometrics, Elsevier, vol. 208(2), pages 487-506.
    3. Luiz Paulo Fávero & Joseph F. Hair & Rafael de Freitas Souza & Matheus Albergaria & Talles V. Brugni, 2021. "Zero-Inflated Generalized Linear Mixed Models: A Better Way to Understand Data Relationships," Mathematics, MDPI, vol. 9(10), pages 1-28, May.
    4. Kocięcki, Andrzej & Kolasa, Marcin, 2023. "A solution to the global identification problem in DSGE models," Journal of Econometrics, Elsevier, vol. 236(2).
    5. Carvalho Lopes, Celia Mendes & Bolfarine, Heleno, 2012. "Random effects in promotion time cure rate models," Computational Statistics & Data Analysis, Elsevier, vol. 56(1), pages 75-87, January.
    6. Neusser, Klaus, 2016. "A topological view on the identification of structural vector autoregressions," Economics Letters, Elsevier, vol. 144(C), pages 107-111.
    7. Orazio Attanasio & Sarah Cattan & Emla Fitzsimons & Costas Meghir & Marta Rubio-Codina, 2020. "Estimating the Production Function for Human Capital: Results from a Randomized Controlled Trial in Colombia," American Economic Review, American Economic Association, vol. 110(1), pages 48-85, January.
    8. Chrysanthos Dellarocas & Charles A. Wood, 2008. "The Sound of Silence in Online Feedback: Estimating Trading Risks in the Presence of Reporting Bias," Management Science, INFORMS, vol. 54(3), pages 460-476, March.
    9. Gauss Cordeiro & Josemar Rodrigues & Mário Castro, 2012. "The exponential COM-Poisson distribution," Statistical Papers, Springer, vol. 53(3), pages 653-664, August.
    10. Xiaohong Chen & Victor Chernozhukov & Sokbae Lee & Whitney K. Newey, 2014. "Local Identification of Nonparametric and Semiparametric Models," Econometrica, Econometric Society, vol. 82(2), pages 785-809, March.
    11. Cho, Daegon & Hwang, Youngdeok & Park, Jongwon, 2018. "More buzz, more vibes: Impact of social media on concert distribution," Journal of Economic Behavior & Organization, Elsevier, vol. 156(C), pages 103-113.
    12. Greene, William, 2007. "Functional Form and Heterogeneity in Models for Count Data," Foundations and Trends(R) in Econometrics, now publishers, vol. 1(2), pages 113-218, August.
    13. Naimoli, Antonio & Storti, Giuseppe, 2019. "Heterogeneous component multiplicative error models for forecasting trading volumes," International Journal of Forecasting, Elsevier, vol. 35(4), pages 1332-1355.
    14. Daeyoung Kim & Bruce Lindsay, 2015. "Empirical identifiability in finite mixture models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(4), pages 745-772, August.
    15. Andrew Chesher & Adam Rosen, 2015. "Characterizations of identified sets delivered by structural econometric models," CeMMAP working papers 63/15, Institute for Fiscal Studies.
    16. Mevin B. Hooten & Michael R. Schwob & Devin S. Johnson & Jacob S. Ivan, 2023. "Multistage hierarchical capture–recapture models," Environmetrics, John Wiley & Sons, Ltd., vol. 34(6), September.
    17. Can Zhou & Yan Jiao & Joan Browder, 2019. "How much do we know about seabird bycatch in pelagic longline fisheries? A simulation study on the potential bias caused by the usually unobserved portion of seabird bycatch," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-19, August.
    18. Zirogiannis, Nikolaos & Tripodis, Yorghos, 2013. "A Generalized Dynamic Factor Model for Panel Data: Estimation with a Two-Cycle Conditional Expectation-Maximization Algorithm," Working Paper Series 142752, University of Massachusetts, Amherst, Department of Resource Economics.
    19. Das, Ujjwal & Das, Kalyan, 2018. "Inference on zero inflated ordinal models with semiparametric link," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 104-115.
    20. Gary Koop & M. Hashem Pesaran & Ron P. Smith, 2013. "On Identification of Bayesian DSGE Models," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 31(3), pages 300-314, July.
