
The Effect of Sample Size on the Efficiency of Count Data Models: Application to Marriage Data

Author

Listed:
  • Volition Tlhalitshi Montshiwa
  • Ntebogang Dinah Moroke

Abstract

Sample size requirements are common in many multivariate analysis techniques as one of the measures taken to ensure the robustness of such techniques; however, such requirements have not been of interest in the area of count data models. This study therefore investigated the effect of sample size on the efficiency of six commonly used count data models, namely the Poisson regression model (PRM), Negative binomial regression model (NBRM), Zero-inflated Poisson (ZIP), Zero-inflated negative binomial (ZINB), Poisson hurdle model (PHM) and Negative binomial hurdle model (NBHM). The data used in this study were sourced from Data First and were collected by Statistics South Africa through the Marriage and Divorce database. The six models were applied to ten randomly selected samples ranging in size from 4392 to 43916 and differing by 10% in size. The models were compared using the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Vuong’s test for over-dispersion, McFadden RSQ, Mean Square Error (MSE) and Mean Absolute Deviation (MAD). The results revealed that, in general, the Negative Binomial-based models outperformed the Poisson-based models. However, the results did not reveal an effect of sample size on the efficiency of the models, since AIC, BIC, Vuong’s test for over-dispersion, McFadden RSQ, MSE and MAD did not change consistently as the sample size increased.
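
The comparison design described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes that the ten sample sizes are 10%, 20%, ..., 100% of the largest sample (43916), rounded to the nearest integer, and that the criteria follow their usual textbook definitions. All function names are hypothetical, and model fitting and Vuong's test are omitted.

```python
import math

FULL_N = 43916  # largest sample size reported in the abstract


def sample_sizes(full_n=FULL_N, steps=10):
    """Sizes at 10%, 20%, ..., 100% of full_n (assumed design), rounded half up."""
    return [(2 * full_n * k + steps) // (2 * steps) for k in range(1, steps + 1)]


def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * loglik


def mcfadden_r2(loglik_model, loglik_null):
    """McFadden pseudo R-squared: 1 - ln L(model) / ln L(intercept-only)."""
    return 1 - loglik_model / loglik_null


def mse(observed, fitted):
    """Mean square error between observed counts and fitted means."""
    return sum((y - f) ** 2 for y, f in zip(observed, fitted)) / len(observed)


def mad(observed, fitted):
    """Mean absolute deviation between observed counts and fitted means."""
    return sum(abs(y - f) for y, f in zip(observed, fitted)) / len(observed)
```

Under these assumptions, `sample_sizes()` starts at 4392 and ends at 43916, matching the range reported in the abstract. Each criterion would then be computed for every model at every sample size and inspected for a consistent trend.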

Suggested Citation

  • Volition Tlhalitshi Montshiwa & Ntebogang Dinah Moroke, 2017. "The Effect of Sample Size on the Efficiency of Count Data Models: Application to Marriage Data," Journal of Economics and Behavioral Studies, AMH International, vol. 9(3), pages 6-18.
  • Handle: RePEc:rnd:arjebs:v:9:y:2017:i:3:p:6-18
    DOI: 10.22610/jebs.v9i3(J).1742

    Download full text from publisher

    File URL: https://ojs.amhinternational.com/index.php/jebs/article/view/1742/1440
    Download Restriction: no

    File URL: https://ojs.amhinternational.com/index.php/jebs/article/view/1742
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Taghouti, Ibtissem & Martinez-Gomez, Victor & Marti, Luisa, 2017. "Sanitary and Phytosanitary measures in agri-food imports from the European Union: Reputation effects over time," Economia Agraria y Recursos Naturales, Spanish Association of Agricultural Economists, vol. 16(02), January.
    2. Lorena Tudela-Marco & Jose Maria Garcia-Alvarez-Coque & Luisa Martí-Selva, 2017. "Do EU Member States Apply Food Standards Uniformly? A Look at Fruit and Vegetable Safety Notifications," Journal of Common Market Studies, Wiley Blackwell, vol. 55(2), pages 387-405, March.
    3. Christian Kleiber & Achim Zeileis, 2016. "Visualizing Count Data Regressions Using Rootograms," The American Statistician, Taylor & Francis Journals, vol. 70(3), pages 296-303, July.
    4. Drivas, Kyriakos & Economidou, Claire & Karamanis, Dimitrios & Sanders, Mark, 2020. "Mobility of highly skilled individuals and local innovation activity," Technological Forecasting and Social Change, Elsevier, vol. 158(C).
    5. Jose‐Maria Garcia‐Alvarez‐Coque & Ibtissem Taghouti & Victor Martinez‐Gomez, 2020. "Changes in Aflatoxin Standards: Implications for EU Border Controls of Nut Imports," Applied Economic Perspectives and Policy, John Wiley & Sons, vol. 42(3), pages 524-541, September.
    6. Jessie Bakens & Raymond J.G.M. Florax & Peter Mulder, 2018. "Ethnic drift and white flight: A gravity model of neighborhood formation," Journal of Regional Science, Wiley Blackwell, vol. 58(5), pages 921-948, November.
    7. Moritz Berger & Gerhard Tutz, 2021. "Transition models for count data: a flexible alternative to fixed distribution models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(4), pages 1259-1283, October.
    8. Mihaela Covrig & Iulian Mircea & Gheorghita Zbaganu & Alexandru Coser & Alexandru Tindeche, 2015. "Using R In Generalized Linear Models," Romanian Statistical Review, Romanian Statistical Review, vol. 63(3), pages 33-45, September.
    9. Bach, Philipp & Farbmacher, Helmut & Spindler, Martin, 2018. "Semiparametric count data modeling with an application to health service demand," Econometrics and Statistics, Elsevier, vol. 8(C), pages 125-140.
    10. Taghouti, Ibtissem & Martinez-Gomez, Victor & Coque, José María Garcia Alvarez, 2015. "Exploring Eu Food Safety Notifications On Agro-Food Imports: Are Mediterranean Partner Countries Discriminated?," International Journal of Food and Agricultural Economics (IJFAEC), Alanya Alaaddin Keykubat University, Department of Economics and Finance, vol. 3(2), pages 1-15, April.
    11. Marjan Qazvini, 2019. "On the Validation of Claims with Excess Zeros in Liability Insurance: A Comparative Study," Risks, MDPI, vol. 7(3), pages 1-17, June.
    12. Feihong Xia, 2023. "Why to use Poisson regression for count data analysis in consumer behavior research," Journal of Marketing Analytics, Palgrave Macmillan, vol. 11(3), pages 379-384, September.
    13. Jennifer S. K. Chan & S. T. Boris Choy & Udi Makov & Ariel Shamir & Vered Shapovalov, 2022. "Variable Selection Algorithm for a Mixture of Poisson Regression for Handling Overdispersion in Claims Frequency Modeling Using Telematics Car Driving Data," Risks, MDPI, vol. 10(4), pages 1-10, April.
    14. Xuejun Jiang & Yunxian Li & Aijun Yang & Ruowei Zhou, 2020. "Bayesian semiparametric quantile regression modeling for estimating earthquake fatality risk," Empirical Economics, Springer, vol. 58(5), pages 2085-2103, May.
    15. Andrew Boutton & Vito D’Orazio, 2020. "Buying blue helmets: The role of foreign aid in the construction of UN peacekeeping missions," Journal of Peace Research, Peace Research Institute Oslo, vol. 57(2), pages 312-328, March.
    16. Thorsten Simon & Georg J. Mayr & Nikolaus Umlauf & Achim Zeileis, 2018. "Lightning Prediction Using Model Output Statistics," Working Papers 2018-14, Faculty of Economics and Statistics, Universität Innsbruck.
    17. Nan-Ting Liu & Feng-Chang Lin & Yu-Shan Shih, 2020. "Count regression trees," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(1), pages 5-27, March.
    18. Mihaela COVRIG & Dumitru BADEA, 2017. "Some Generalized Linear Models for the Estimation of the Mean Frequency of Claims in Motor Insurance," ECONOMIC COMPUTATION AND ECONOMIC CYBERNETICS STUDIES AND RESEARCH, Faculty of Economic Cybernetics, Statistics and Informatics, vol. 51(4), pages 91-107.
    19. Chaocheng He & Jiang Wu & Qingpeng Zhang, 2020. "Research leadership flow determinants and the role of proximity in research collaborations," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 71(11), pages 1341-1356, November.
    20. Antonio J. Sáez-Castillo & Antonio Conde-Sánchez, 2017. "Detecting over- and under-dispersion in zero inflated data with the hyper-Poisson regression model," Statistical Papers, Springer, vol. 58(1), pages 19-33, March.
