
On the Discretization of Continuous Probability Distributions Using a Probabilistic Rounding Mechanism

Author

Listed:
  • Chénangnon Frédéric Tovissodé

    (Laboratoire de Biomathématiques et d’Estimations Forestières, Université d’Abomey-Calavi, Abomey-Calavi, Benin)

  • Sèwanou Hermann Honfo

    (Laboratoire de Biomathématiques et d’Estimations Forestières, Université d’Abomey-Calavi, Abomey-Calavi, Benin)
    These authors contributed equally to this work.

  • Jonas Têlé Doumatè

    (Laboratoire de Biomathématiques et d’Estimations Forestières, Université d’Abomey-Calavi, Abomey-Calavi, Benin;
    Faculté des Sciences et Techniques, Université d’Abomey-Calavi, Abomey-Calavi, Benin)
    These authors contributed equally to this work.

  • Romain Glèlè Kakaï

    (Laboratoire de Biomathématiques et d’Estimations Forestières, Université d’Abomey-Calavi, Abomey-Calavi, Benin)

Abstract

Most existing flexible count distributions allow only approximate inference when used in a regression context. This work proposes a new framework to provide an exact and flexible alternative for modeling and simulating count data with various types of dispersion (equi-, under-, and over-dispersion). The new method, referred to as “balanced discretization”, consists of discretizing continuous probability distributions while preserving expectations. It is easy to generate pseudo-random variates from the resulting balanced discrete distribution because it has a simple stochastic representation (probabilistic rounding) in terms of the continuous distribution. For illustrative purposes, we develop the family of balanced discrete gamma distributions that can model equi-, under-, and over-dispersed count data. This family of count distributions is appropriate for building flexible count regression models because the expectation of the distribution has a simple expression in terms of the parameters of the distribution. Using the Jensen–Shannon divergence measure, we show that under the equidispersion restriction, the family of balanced discrete gamma distributions is similar to the Poisson distribution. Based on this, we conjecture that, while covering all types of dispersion, a count regression model based on the balanced discrete gamma distribution will recover a near-Poisson model fit when the data are Poisson distributed.
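
The probabilistic rounding mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual reading of the term: a continuous draw x is rounded down to floor(x), then increased by one with probability equal to its fractional part, so that the rounded value has conditional expectation x. The function names, the shape/mean parameterization of the gamma distribution, and the sample size are illustrative assumptions, not taken from the paper.

```python
import math
import random


def probabilistic_round(x, rng=random):
    """Round x to floor(x) or floor(x) + 1 so that the expected result is x."""
    lower = math.floor(x)
    frac = x - lower  # fractional part, in [0, 1)
    # Round up with probability equal to the fractional part.
    return lower + (1 if rng.random() < frac else 0)


def balanced_discrete_gamma(mean, shape, rng=random):
    """Draw one count by probabilistically rounding a gamma variate.

    Assumed parameterization: a gamma with the given shape and
    scale = mean / shape, whose expectation is `mean`.
    """
    x = rng.gammavariate(shape, mean / shape)
    return probabilistic_round(x, rng)


# Simulation check: the sample mean of the counts should be close to `mean`.
rng = random.Random(0)
draws = [balanced_discrete_gamma(3.0, 3.0, rng) for _ in range(200_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)
```

Because the rounding step is unbiased conditionally on x, the mean of the discretized variable equals the mean of the continuous one; this is the property that makes the expectation easy to link to regression covariates.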

Suggested Citation

  • Chénangnon Frédéric Tovissodé & Sèwanou Hermann Honfo & Jonas Têlé Doumatè & Romain Glèlè Kakaï, 2021. "On the Discretization of Continuous Probability Distributions Using a Probabilistic Rounding Mechanism," Mathematics, MDPI, vol. 9(5), pages 1-17, March.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:5:p:555-:d:511924

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/5/555/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/5/555/
    Download Restriction: no

    References listed on IDEAS

    1. Roy, D., 1993. "Reliability Measures in the Discrete Bivariate Set-Up and Related Characterization Results for a Bivariate Geometric Distribution," Journal of Multivariate Analysis, Elsevier, vol. 46(2), pages 362-373, August.
    2. Hagmark, Per-Erik, 2008. "On construction and simulation of count data models," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 77(1), pages 72-80.
    3. Walmes Marques Zeviani & Paulo Justiniano Ribeiro & Wagner Hugo Bonat & Silvia Emiko Shimakura & Joel Augusto Muniz, 2014. "The Gamma-count distribution in the analysis of experimental underdispersed data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(12), pages 2616-2626, December.
    4. Hagmark, Per-Erik, 2009. "A new concept for count distributions," Statistics & Probability Letters, Elsevier, vol. 79(8), pages 1120-1124, April.
    5. Veraart, Almut E.D., 2019. "Modeling, simulation and inference for multivariate time series of counts using trawl processes," Journal of Multivariate Analysis, Elsevier, vol. 169(C), pages 110-129.
    6. Cameron, A Colin & Johansson, Per, 1997. "Count Data Regression Using Series Expansions: With Applications," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 12(3), pages 203-223, May-June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. repec:ebl:ecbull:v:3:y:2008:i:42:p:1-13 is not listed on IDEAS
    2. Grzesiek, Aleksandra & Połoczański, Rafał & Kumar, Arun & Wyłomańska, Agnieszka, 2021. "Moment-based estimation for parameters of general inverse subordinator," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 575(C).
    3. van der Klaauw, Bas & Koning, Ruud H, 2003. "Testing the Normality Assumption in the Sample Selection Model with an Application to Travel Demand," Journal of Business & Economic Statistics, American Statistical Association, vol. 21(1), pages 31-42, January.
    4. Stefano Mainardi, 2003. "Testing convergence in life expectancies: count regression models on panel data," Prague Economic Papers, Prague University of Economics and Business, vol. 2003(4), pages 350-370.
    5. A. Colin Cameron & Per Johansson, 2004. "Bivariate Count Data Regression Using Series Expansions: With Applications," Working Papers 9815, University of California, Davis, Department of Economics.
    6. Fokianos, Konstantinos & Fried, Roland & Kharin, Yuriy & Voloshko, Valeriy, 2022. "Statistical analysis of multivariate discrete-valued time series," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    7. Jie Q. Guo & Pravin K. Trivedi, 2002. "Flexible Parametric Models for Long‐tailed Patent Count Distributions," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 64(1), pages 63-82, February.
    8. Sarker, Rakhal & Surry, Yves R., 2003. "The Fast Decay Process In Recreational Demand Activities And The Use Of Alternative Count Data Models," Working Papers 34147, University of Guelph, Department of Food, Agricultural and Resource Economics.
    9. Subrata Chakraborty & S. H. Ong, 2017. "Mittag - Leffler function distribution - a new generalization of hyper-Poisson distribution," Journal of Statistical Distributions and Applications, Springer, vol. 4(1), pages 1-17, December.
    10. Harris, Matthew & Kohn, Jennifer, 2015. "Reference dependent utility from health and the demand for medical care," MPRA Paper 61926, University Library of Munich, Germany.
    11. Buddana Amrutha & Kozubowski Tomasz J., 2014. "Discrete Pareto Distributions," Stochastics and Quality Control, De Gruyter, vol. 29(2), pages 143-156, December.
    12. James E. Prieger, "undated". "A Generalized Parametric Selection Model for Non-Normal Data," Department of Economics 00-09, California Davis - Department of Economics.
    13. Marco Alfò & Giovanni Trovato, 2004. "Semiparametric Mixture Models for Multivariate Count Data, with Application," CEIS Research Paper 51, Tor Vergata University, CEIS.
    14. Bilal Ahmad Para & Tariq Rashid Jan, 2019. "On Three Parameter Discrete Generalized Inverse Weibull Distribution: Properties and Applications," Annals of Data Science, Springer, vol. 6(3), pages 549-570, September.
    15. Smith, David M. & Faddy, Malcolm J., 2016. "Mean and Variance Modeling of Under- and Overdispersed Count Data," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 69(i06).
    16. Dilip Roy, 2002. "On Bivariate Lack of Memory Property and a New Definition," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 54(2), pages 404-410, June.
    17. Eduardo Fé & Richard Hofler, 2013. "Count data stochastic frontier models, with an application to the patents–R&D relationship," Journal of Productivity Analysis, Springer, vol. 39(3), pages 271-284, June.
    18. Bauer, Thomas K. & Million, Andreas & Rotte, Ralph & Zimmermann, Klaus F., 1998. "Immigration Labor and Workplace Safety," IZA Discussion Papers 16, Institute of Labor Economics (IZA).
    19. Andrés Romeu & Marcos Vera-Hernández, 2005. "Counts with an endogenous binary regressor: A series expansion approach," Econometrics Journal, Royal Economic Society, vol. 8(1), pages 1-22, March.
    20. Marcelo Bourguignon & Diego I. Gallardo & Rodrigo M. R. Medeiros, 2022. "A simple and useful regression model for underdispersed count data based on Bernoulli–Poisson convolution," Statistical Papers, Springer, vol. 63(3), pages 821-848, June.
    21. Adeniyi, Isaac Adeola, 2020. "Bayesian Generalized Linear Mixed Effects Models Using Normal-Independent Distributions: Formulation and Applications," MPRA Paper 99165, University Library of Munich, Germany.
