
A regression model for overdispersed data without too many zeros

Author

Listed:
  • José Rodríguez-Avi

    (University of Jaén)

  • María José Olmo-Jiménez

    (University of Jaén)

Abstract

A regression model for overdispersed count data based on the complex biparametric Pearson (CBP) distribution is developed. It is compared with the generalized Poisson regression model, the negative binomial regression model and the zero inflated Poisson regression model, which are based on the generalized Poisson (GP), negative binomial (NB) and zero inflated Poisson (ZIP) distributions, respectively. It is shown that the CBP distribution is more adequate than the GP, NB and ZIP distributions when the overdispersion is not related to a higher frequency of 0, but to other low values greater than 0, so it may be appropriate for overdispersed cases in which there are external reasons that raise the number of low values different from 0. Firstly, we study the shape and the parameters of the CBP distribution and we compare it with the Poisson, GP, NB and ZIP distributions by means of the probability of 0, the skewness and kurtosis coefficients and the Kullback–Leibler divergence. Furthermore, we present an application example where the aforementioned performance is shown by the number of public educational facilities by municipality in Andalusia (Spain). Secondly, we describe two regression models based on the CBP distribution and the estimation method for their parameters. Thirdly, we carry out a simulation study that reveals the performance of the regression models proposed. Finally, an application in the field of sport illustrates that these models can provide more accurate fits than those provided by other usual regression models for count data.
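The comparison criteria named in the abstract (probability of 0 and Kullback–Leibler divergence) can be illustrated with a minimal Python sketch. It is not the authors' code and it does not include the CBP distribution, whose probability function is not given on this page; it only uses the standard Poisson, NB and ZIP distributions. The mean of 3, the NB size of 2 and the helper names are illustrative assumptions, chosen so that NB and ZIP share the same mean and variance.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of count k with mean mu (computed on the log scale)."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def nb_pmf(k, mu, size):
    """Negative binomial with mean mu and variance mu + mu**2 / size."""
    p = size / (size + mu)
    log_pmf = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
               + size * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

def zip_pmf(k, mu, pi):
    """Zero inflated Poisson with overall mean mu and extra-zero probability pi."""
    lam = mu / (1.0 - pi)  # rate of the Poisson component
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def moments(pmf, kmax=100):
    """Mean and variance from the pmf truncated at kmax (tail assumed negligible)."""
    mean = sum(k * pmf(k) for k in range(kmax + 1))
    second = sum(k * k * pmf(k) for k in range(kmax + 1))
    return mean, second - mean ** 2

def kl_divergence(p, q, kmax=100):
    """Kullback-Leibler divergence KL(p || q) over the counts 0..kmax."""
    total = 0.0
    for k in range(kmax + 1):
        pk = p(k)
        if pk > 0.0:
            total += pk * math.log(pk / q(k))
    return total

# Illustrative values (assumptions, not taken from the paper).
MU = 3.0
SIZE = 2.0
PI = 1.0 / (1.0 + SIZE)  # gives ZIP the same variance as NB

models = {
    "Poisson": lambda k: poisson_pmf(k, MU),
    "NB": lambda k: nb_pmf(k, MU, SIZE),
    "ZIP": lambda k: zip_pmf(k, MU, PI),
}

for name, pmf in models.items():
    mean, var = moments(pmf)
    kl = kl_divergence(pmf, models["Poisson"])
    print(f"{name:8s} mean={mean:.3f} var={var:.3f} "
          f"P(0)={pmf(0):.4f} KL(model||Poisson)={kl:.4f}")
```

With these settings NB and ZIP are equally overdispersed, yet they put different mass on 0, which is the kind of contrast the probability-of-0 criterion is meant to expose.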

Suggested Citation

  • José Rodríguez-Avi & María José Olmo-Jiménez, 2017. "A regression model for overdispersed data without too many zeros," Statistical Papers, Springer, vol. 58(3), pages 749-773, September.
  • Handle: RePEc:spr:stpapr:v:58:y:2017:i:3:d:10.1007_s00362-015-0724-9
    DOI: 10.1007/s00362-015-0724-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-015-0724-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Cameron,A. Colin & Trivedi,Pravin K., 2013. "Regression Analysis of Count Data," Cambridge Books, Cambridge University Press, number 9781107667273, January.
    2. Feng-Chang Xie & Jin-Guan Lin & Bo-Cheng Wei, 2014. "Bayesian zero-inflated generalized Poisson regression model: estimation and case influence diagnostics," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(6), pages 1383-1392, June.
    3. Mullahy, John, 1986. "Specification and testing of some modified count data models," Journal of Econometrics, Elsevier, vol. 33(3), pages 341-365, December.
    4. Rainer Winkelmann, 2008. "Econometric Analysis of Count Data," Springer Books, Springer, edition 0, number 978-3-540-78389-3, September.
    5. Rigby, R.A. & Stasinopoulos, D.M. & Akantziliotou, C., 2008. "A framework for modelling overdispersed count data, including the Poisson-shifted generalized inverse Gaussian distribution," Computational Statistics & Data Analysis, Elsevier, vol. 53(2), pages 381-393, December.
    6. K. Poortema, 1999. "On modelling overdispersion of counts," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 53(1), pages 5-20, March.
    7. Hinde, John & Demetrio, Clarice G. B., 1998. "Overdispersion: Models and estimation," Computational Statistics & Data Analysis, Elsevier, vol. 27(2), pages 151-170, April.
    8. Cordeiro, Gauss M. & Andrade, Marinho G. & de Castro, Mário, 2009. "Power series generalized nonlinear models," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1155-1166, February.
    9. Ajiferuke, Isola & Famoye, Felix, 2015. "Modelling count response variables in informetric studies: Comparison among count, linear, and lognormal regression models," Journal of Informetrics, Elsevier, vol. 9(3), pages 499-513.
    10. Hossein Zamani & Noriszura Ismail, 2013. "Score test for testing zero-inflated Poisson regression against zero-inflated generalized Poisson alternatives," Journal of Applied Statistics, Taylor & Francis Journals, vol. 40(9), pages 2056-2068, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rodríguez-Avi, J. & Conde-Sánchez, A. & Sáez-Castillo, A.J. & Olmo-Jiménez, M.J. & Martínez-Rodríguez, A.M., 2009. "A generalized Waring regression model for count data," Computational Statistics & Data Analysis, Elsevier, vol. 53(10), pages 3717-3725, August.
    2. Rainer Winkelmann, 2015. "Counting on count data models," IZA World of Labor, Institute of Labor Economics (IZA), pages 148-148, May.
    3. Joan Costa-Font & Sergi Jiménez-Martín & Cristina Vilaplana, 2016. "Does long-term care subsidisation reduce unnecessary hospitalisations?," Economics Working Papers 1535, Department of Economics and Business, Universitat Pompeu Fabra.
    4. Ana María Martínez-Rodríguez & Antonio Conde-Sánchez & María José Olmo-Jiménez, 2019. "A new approach to truncated regression for count data," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 103(4), pages 503-526, December.
    5. Moritz Berger & Gerhard Tutz, 2021. "Transition models for count data: a flexible alternative to fixed distribution models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(4), pages 1259-1283, October.
    6. Costa-Font, Joan & Jiménez-Martínez, Sergi & Vilaplana, Cristina, 2016. "Does long-term care subsidisation reduce hospital admissions?," LSE Research Online Documents on Economics 67911, London School of Economics and Political Science, LSE Library.
    7. Costa-Font, Joan & Jimenez-Martin, Sergi & Vilaplana, Cristina, 2018. "Does long-term care subsidization reduce hospital admissions and utilization?," Journal of Health Economics, Elsevier, vol. 58(C), pages 43-66.
    8. Gregori Baetschmann & Rainer Winkelmann, 2014. "A dynamic hurdle model for zero-inflated count data: with an application to health care utilization," ECON - Working Papers 151, Department of Economics - University of Zurich.
    9. Katiane S. Conceição & Marinho G. Andrade & Francisco Louzada & Nalini Ravishanker, 2022. "Characterizations and generalizations of the negative binomial distribution," Computational Statistics, Springer, vol. 37(3), pages 1255-1286, July.
    10. Bono, Pierre-Henri & David, Quentin & Desbordes, Rodolphe & Py, Loriane, 2022. "Metro infrastructure and metropolitan attractiveness," Regional Science and Urban Economics, Elsevier, vol. 93(C).
    11. Christian Kleiber & Achim Zeileis, 2016. "Visualizing Count Data Regressions Using Rootograms," The American Statistician, Taylor & Francis Journals, vol. 70(3), pages 296-303, July.
    12. J. M. C. Santos Silva & Silvana Tenreyro, 2022. "The Log of Gravity at 15," Portuguese Economic Journal, Springer;Instituto Superior de Economia e Gestao, vol. 21(3), pages 423-437, September.
    13. Chiara Bocci & Laura Grassini & Emilia Rocco, 2021. "A multiple inflated negative binomial hurdle regression model: analysis of the Italians’ tourism behaviour during the Great Recession," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(4), pages 1109-1133, October.
    14. Soutik Ghosal & Timothy S. Lau & Jeremy Gaskins & Maiying Kong, 2020. "A hierarchical mixed effect hurdle model for spatiotemporal count data and its application to identifying factors impacting health professional shortages," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(5), pages 1121-1144, November.
    15. Sarni Maniar Berliana & Purhadi & Sutikno & Santi Puteri Rahayu, 2020. "Parameter Estimation and Hypothesis Testing of Geographically Weighted Multivariate Generalized Poisson Regression," Mathematics, MDPI, vol. 8(9), pages 1-14, September.
    16. Lluís Bermúdez & Dimitris Karlis & Isabel Morillo, 2020. "Modelling Unobserved Heterogeneity in Claim Counts Using Finite Mixture Models," Risks, MDPI, vol. 8(1), pages 1-13, January.
    17. Jiang, Yuan & House, Lisa A., 2017. "Comparison of the Performance of Count Data Models under Different Zero-Inflation Scenarios Using Simulation Studies," 2017 Annual Meeting, July 30-August 1, Chicago, Illinois 258342, Agricultural and Applied Economics Association.
    18. Koppenberg, Maximilian & Mishra, Ashok K. & Hirsch, Stefan, 2023. "Food Aid and Violent Conflict: A Review of Literature," IZA Discussion Papers 16574, Institute of Labor Economics (IZA).
    19. José M. R. Murteira & Mário A. G. Augusto, 2017. "Hurdle models of repayment behaviour in personal loan contracts," Empirical Economics, Springer, vol. 53(2), pages 641-667, September.
    20. Joan Costa‐Font & Cristina Vilaplana‐Prieto, 2020. "‘More than one red herring'? Heterogeneous effects of ageing on health care utilisation," Health Economics, John Wiley & Sons, Ltd., vol. 29(S1), pages 8-29, October.
