IDEAS home Printed from https://ideas.repec.org/a/gam/jstats/v4y2021i2p21-326d536861.html

A Flexible Multivariate Distribution for Correlated Count Data

Author

Listed:
  • Kimberly F. Sellers

    (Department of Mathematics and Statistics, Georgetown University, Washington, DC 20057, USA
    Center for Statistical Research and Methodology, U. S. Census Bureau, Washington, DC 20233, USA)

  • Tong Li

    (Department of Mathematics and Statistics, Georgetown University, Washington, DC 20057, USA)

  • Yixuan Wu

    (Department of Mathematics and Statistics, Georgetown University, Washington, DC 20057, USA)

  • Narayanaswamy Balakrishnan

    (Department of Mathematics and Statistics, McMaster University, Hamilton, ON L8S 4K1, Canada)

Abstract

Multivariate count data are often modeled via a multivariate Poisson distribution, but this distribution carries an underlying, constraining assumption of data equi-dispersion (i.e., the variance equals the mean). Real data are often over-dispersed and, as such, analysts consider various extensions of a negative binomial structure. Although over-dispersion is more prevalent than under-dispersion in real data, examples containing under-dispersed data are surfacing with greater frequency. Thus, there is a demonstrated need for a flexible model that can accommodate both data types. We develop a multivariate Conway–Maxwell–Poisson (MCMP) distribution to serve as a flexible alternative for correlated count data that exhibit data dispersion. This structure contains the multivariate Poisson, multivariate geometric, and multivariate Bernoulli distributions as special cases, and serves as a bridge distribution across these three classical models to address other levels of over- or under-dispersion. In this work, we not only derive the distributional form and statistical properties of this model, but also address parameter estimation, establish informative hypothesis tests to detect statistically significant data dispersion and aid in model parsimony, and illustrate the distribution’s flexibility through several simulated and real-world data examples. These examples demonstrate that the MCMP distribution performs on par with the multivariate negative binomial distribution for over-dispersed data and proves particularly beneficial in effectively representing under-dispersed data. Thus, the MCMP distribution offers an effective, unifying framework for modeling over- or under-dispersed multivariate correlated count data that do not necessarily adhere to Poisson assumptions.
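
The bridging behavior described in the abstract can be illustrated with the standard univariate Conway–Maxwell–Poisson (CMP) probability mass function, P(X = x) proportional to lambda^x / (x!)^nu. The sketch below is only an illustration under that assumption: it is not the multivariate construction developed in the article, and the function names, the truncation length `terms`, and the parameter values are hypothetical choices made for this example. It shows that nu = 1 recovers the Poisson case, nu = 0 with lambda < 1 recovers the geometric case, a large nu approaches the Bernoulli case, and that the variance-to-mean ratio falls below 1 when nu > 1 and rises above 1 when nu < 1.

```python
import math


def cmp_pmf_vector(lam, nu, terms=200):
    """Univariate CMP probabilities for x = 0..terms-1 (illustrative sketch).

    The normalizing constant is approximated by truncating the infinite
    series at `terms` (an assumption; requires lam > 0, and lam < 1 if nu == 0).
    """
    # Work on the log scale: log weight = x*log(lam) - nu*log(x!)
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    peak = max(log_w)
    weights = [math.exp(v - peak) for v in log_w]  # each exponent is <= 0
    total = sum(weights)
    return [w / total for w in weights]


def dispersion_index(lam, nu, terms=200):
    """Variance-to-mean ratio of the truncated CMP distribution."""
    probs = cmp_pmf_vector(lam, nu, terms)
    mean = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
    return var / mean


if __name__ == "__main__":
    # nu = 1: Poisson(lam); compare against the closed-form Poisson pmf at x = 3
    poisson_at_3 = math.exp(-2.0) * 2.0 ** 3 / math.factorial(3)
    print(cmp_pmf_vector(2.0, 1.0)[3], poisson_at_3)

    # nu = 0, lam < 1: geometric with success probability 1 - lam
    print(cmp_pmf_vector(0.5, 0.0)[0])

    # large nu: mass concentrates on {0, 1}, with P(X = 1) near lam / (1 + lam)
    print(cmp_pmf_vector(2.0, 50.0)[1])

    # dispersion: > 1 for nu < 1, about 1 for nu = 1, < 1 for nu > 1
    for nu in (0.5, 1.0, 2.0):
        print(nu, dispersion_index(2.0, nu))
```

Computing the weights on the log scale avoids overflow in the factorial term; the truncation at 200 terms is adequate for the small lambda values used here but would need to be revisited for large lambda combined with small nu.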

Suggested Citation

  • Kimberly F. Sellers & Tong Li & Yixuan Wu & Narayanaswamy Balakrishnan, 2021. "A Flexible Multivariate Distribution for Correlated Count Data," Stats, MDPI, vol. 4(2), pages 1-19, April.
  • Handle: RePEc:gam:jstats:v:4:y:2021:i:2:p:21-326:d:536861

    Download full text from publisher

    File URL: https://www.mdpi.com/2571-905X/4/2/21/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2571-905X/4/2/21/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kimberly F. Sellers & Ali Arab & Sean Melville & Fanyu Cui, 2021. "A flexible univariate moving average time-series model for dispersed count data," Journal of Statistical Distributions and Applications, Springer, vol. 8(1), pages 1-12, December.
    2. Morris, Darcy Steeg & Raim, Andrew M. & Sellers, Kimberly F., 2020. "A Conway–Maxwell-multinomial distribution for flexible modeling of clustered categorical data," Journal of Multivariate Analysis, Elsevier, vol. 179(C).
    3. Mamode Khan Naushad & Rumjaun Wasseem & Sunecher Yuvraj & Jowaheer Vandna, 2017. "Computing with bivariate COM-Poisson model under different copulas," Monte Carlo Methods and Applications, De Gruyter, vol. 23(2), pages 131-146, June.
    4. Darcy Steeg Morris & Kimberly F. Sellers, 2022. "A Flexible Mixed Model for Clustered Count Data," Stats, MDPI, vol. 5(1), pages 1-18, January.
    5. Gery Geenens, 2024. "(Re-)Reading Sklar (1959)—A Personal View on Sklar’s Theorem," Mathematics, MDPI, vol. 12(3), pages 1-7, January.
    6. Kimberly F. Sellers & Andrew W. Swift & Kimberly S. Weems, 2017. "A flexible distribution class for count data," Journal of Statistical Distributions and Applications, Springer, vol. 4(1), pages 1-21, December.
    7. Fantazzini, Dean, 2020. "Discussing copulas with Sergey Aivazian: a memoir," MPRA Paper 102317, University Library of Munich, Germany.
    8. Zhou, Can & Jiao, Yan & Browder, Joan, 2019. "K-aggregated transformation of discrete distributions improves modeling count data with excess ones," Ecological Modelling, Elsevier, vol. 407(C), pages 1-1.
    9. Darolles, Serge & Fol, Gaëlle Le & Lu, Yang & Sun, Ran, 2019. "Bivariate integer-autoregressive process with an application to mutual fund flows," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 181-203.
    10. Mauro Laudicella & Paolo Li Donni, 2022. "The dynamic interdependence in the demand of primary and emergency secondary care: A hidden Markov approach," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(3), pages 521-536, April.
    11. Suvra Pal & Jacob Majakwara & N. Balakrishnan, 2018. "An EM algorithm for the destructive COM-Poisson regression cure rate model," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 81(2), pages 143-171, February.
    12. Dominique Lord & Srinivas Reddy Geedipally & Seth D. Guikema, 2010. "Extension of the Application of Conway‐Maxwell‐Poisson Models: Analyzing Traffic Crash Data Exhibiting Underdispersion," Risk Analysis, John Wiley & Sons, vol. 30(8), pages 1268-1276, August.
    13. Sellers, Kimberly F. & Morris, Darcy Steeg & Balakrishnan, Narayanaswamy, 2016. "Bivariate Conway–Maxwell–Poisson distribution: Formulation, properties, and inference," Journal of Multivariate Analysis, Elsevier, vol. 150(C), pages 152-168.
    14. S. Hadi Khazraee & Antonio Jose Sáez‐Castillo & Srinivas Reddy Geedipally & Dominique Lord, 2015. "Application of the Hyper‐Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes," Risk Analysis, John Wiley & Sons, vol. 35(5), pages 919-930, May.
    15. Royce A. Francis & Srinivas Reddy Geedipally & Seth D. Guikema & Soma Sekhar Dhavala & Dominique Lord & Sarah LaRocca, 2012. "Characterizing the Performance of the Conway‐Maxwell Poisson Generalized Linear Model," Risk Analysis, John Wiley & Sons, vol. 32(1), pages 167-183, January.
    16. Rufin Bidounga & Evrand Giles Brunel Mandangui Maloumbi & Réolie Foxie Mizélé Kitoti & Dominique Mizère, 2020. "The New Bivariate Conway-Maxwell-Poisson Distribution Obtained by the Crossing Method," International Journal of Statistics and Probability, Canadian Center of Science and Education, vol. 9(6), pages 1-1, November.
    17. Seng Huat Ong & Shin Zhu Sim & Shuangzhe Liu & Hari M. Srivastava, 2023. "A Family of Finite Mixture Distributions for Modelling Dispersion in Count Data," Stats, MDPI, vol. 6(3), pages 1-14, September.
    18. Veraart, Almut E.D., 2019. "Modeling, simulation and inference for multivariate time series of counts using trawl processes," Journal of Multivariate Analysis, Elsevier, vol. 169(C), pages 110-129.
    19. Mathews Joseph & Bhattacharya Sumangal & Das Ishapathik & Sen Sumen, 2022. "Multiple inflated negative binomial regression for correlated multivariate count data," Dependence Modeling, De Gruyter, vol. 10(1), pages 290-307, January.
    20. Giampiero Marra & Rosalba Radice & David Zimmer, 2021. "Did the ACA's “guaranteed issue” provision cause adverse selection into nongroup insurance? Analysis using a copula‐based hurdle model," Health Economics, John Wiley & Sons, Ltd., vol. 30(9), pages 2246-2263, September.
