Printed from https://ideas.repec.org/a/eee/infome/v9y2015i3p499-513.html

Modelling count response variables in informetric studies: Comparison among count, linear, and lognormal regression models

Author

Listed:
  • Ajiferuke, Isola
  • Famoye, Felix

Abstract

The purpose of the study is to compare the performance of count regression models with that of linear and lognormal regression models in modelling count response variables in informetric studies. Count response variables identified in informetric studies include the number of authors, the number of references, the number of views, the number of downloads, and the number of citations received by an article, as well as the number of links from and to a website. Data were collected from the United States Patent and Trademark Office (www.uspto.gov), an open access journal (www.informationr.net/ir/), Web of Science, and Maclean's magazine. The datasets were then used to compare the performance of linear and lognormal regression models with that of Poisson, negative binomial, and generalized Poisson regression models. Because most response variables were over-dispersed, the negative binomial regression model often appeared more appropriate for informetric datasets than the Poisson and generalized Poisson regression models. The regression analyses also showed that the linear regression model predicted some negative values for five of the nine response variables modelled, and that for all the response variables it performed worse than both the negative binomial and lognormal regression models when either Akaike's Information Criterion (AIC) or the Bayesian Information Criterion (BIC) was used as the measure of goodness of fit. The negative binomial regression model performed significantly better than the lognormal regression model for four of the response variables, the lognormal regression model performed significantly better than the negative binomial regression model for two, and there was no significant difference between the two models for the remaining three.
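
The kind of comparison the abstract describes (checking for over-dispersion, then ranking models by AIC and BIC) can be sketched in a few lines. The example below is only an illustration under strong assumptions, not the authors' procedure: it uses made-up counts, fits intercept-only Poisson and negative binomial models with no covariates, holds the mean at the sample mean, and finds the negative binomial size parameter `r` by a crude grid search. All function and variable names are hypothetical.

```python
import math
import statistics


def dispersion_index(counts):
    """Sample variance divided by sample mean; values well above 1 suggest over-dispersion."""
    return statistics.variance(counts) / statistics.mean(counts)


def aic(loglik, k):
    """Akaike's Information Criterion for a model with k fitted parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian Information Criterion for k fitted parameters and n observations."""
    return k * math.log(n) - 2 * loglik


def poisson_loglik(counts, mu):
    """Log-likelihood of an intercept-only Poisson model with mean mu."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)


def negbin_loglik(counts, mu, r):
    """Log-likelihood of an intercept-only negative binomial model (mean mu, size r)."""
    p = r / (r + mu)
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log(1 - p)
        for y in counts
    )


# Hypothetical citation-like counts, invented purely for illustration.
counts = [0, 0, 1, 1, 2, 3, 5, 8, 13, 40]
n = len(counts)
mu_hat = statistics.mean(counts)

# Poisson: one parameter (the mean).
ll_pois = poisson_loglik(counts, mu_hat)

# Negative binomial: two parameters (mean and size r); r chosen on a log-spaced grid.
r_grid = [0.05 * 1.1 ** i for i in range(80)]
r_hat = max(r_grid, key=lambda r: negbin_loglik(counts, mu_hat, r))
ll_nb = negbin_loglik(counts, mu_hat, r_hat)

print("dispersion index:", dispersion_index(counts))
print("Poisson  AIC/BIC:", aic(ll_pois, 1), bic(ll_pois, 1, n))
print("Neg. bin AIC/BIC:", aic(ll_nb, 2), bic(ll_nb, 2, n))
```

Lower AIC or BIC indicates the better-fitting model after penalising for the number of parameters. Comparing a count model with a linear or lognormal model on the same scale needs additional care (the likelihoods must refer to the same response variable), which this sketch does not attempt.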

Suggested Citation

  • Ajiferuke, Isola & Famoye, Felix, 2015. "Modelling count response variables in informetric studies: Comparison among count, linear, and lognormal regression models," Journal of Informetrics, Elsevier, vol. 9(3), pages 499-513.
  • Handle: RePEc:eee:infome:v:9:y:2015:i:3:p:499-513
    DOI: 10.1016/j.joi.2015.05.001

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157715000498
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations



    Cited by:

    1. Shuo Xu & Mengjia An & Xin An, 2021. "Do scientific publications by editorial board members have shorter publication delays and then higher influence?," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6697-6713, August.
    2. Wang, Zhiqi & Chen, Yue & Glänzel, Wolfgang, 2020. "Preprints as accelerator of scholarly communication: An empirical analysis in Mathematics," Journal of Informetrics, Elsevier, vol. 14(4).
    3. Carmela Lutmar & Yaniv Reingewertz, 2021. "Academic in-group bias in the top five economics journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9543-9556, December.
    4. Lutmar, Carmela & Reingewertz, Yaniv, 2020. "Academic in-group bias in economics," MPRA Paper 104730, University Library of Munich, Germany.
    5. Yan Yan & Shanwu Tian & Jingjing Zhang, 2020. "The impact of a paper’s new combinations and new components on its citation," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(2), pages 895-913, February.
    6. Copiello, Sergio, 2019. "Peer and neighborhood effects: Citation analysis using a spatial autoregressive model and pseudo-spatial data," Journal of Informetrics, Elsevier, vol. 13(1), pages 238-254.
    7. Nataliya N. Matveeva & Oleg V. Poldin, 2017. "How Network Characteristics of Researchers Relate to Their Citation Indicators – a Co-Authorship Network Analysis Based on Google Scholar," HSE Working papers WP BRP 44/EDU/2017, National Research University Higher School of Economics.
    8. Zahedi, Zohreh & Haustein, Stefanie, 2018. "On the relationships between bibliographic characteristics of scientific documents and citation and Mendeley readership counts: A large-scale analysis of Web of Science publications," Journal of Informetrics, Elsevier, vol. 12(1), pages 191-202.
    9. Mike Thelwall, 2016. "Interpreting correlations between citation counts and other indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 108(1), pages 337-347, July.
    10. Matveeva, Nataliya & Poldin, Oleg, 2016. "Citation of scholars in co-authorship network: Analysis of Google Scholar data," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 44, pages 100-118.
    11. Ren, Haiying & Zhao, Yuhui, 2021. "Technology opportunity discovery based on constructing, evaluating, and searching knowledge networks," Technovation, Elsevier, vol. 101(C).
    12. Shanwu Tian & Xiurui Xu & Ping Li, 2021. "Acknowledgement network and citation count: the moderating role of collaboration network," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(9), pages 7837-7857, September.
    13. José Rodríguez-Avi & María José Olmo-Jiménez, 2017. "A regression model for overdispersed data without too many zeros," Statistical Papers, Springer, vol. 58(3), pages 749-773, September.
    14. Salim Moussa, 2022. "The propagation of error: retracted articles in marketing and their citations," Italian Journal of Marketing, Springer, vol. 2022(1), pages 11-36, March.
    15. Guan, Jiancheng & Yan, Yan & Zhang, Jing Jing, 2017. "The impact of collaboration and knowledge networks on citations," Journal of Informetrics, Elsevier, vol. 11(2), pages 407-422.
    16. Reingewertz, Yaniv & Lutmar, Carmela, 2018. "Academic in-group bias: An empirical examination of the link between author and journal affiliation," Journal of Informetrics, Elsevier, vol. 12(1), pages 74-86.
    17. Thelwall, Mike, 2016. "The discretised lognormal and hooked power law distributions for complete citation data: Best options for modelling and regression," Journal of Informetrics, Elsevier, vol. 10(2), pages 336-346.
    18. Cao, Xuanyu & Chen, Yan & Ray Liu, K.J., 2016. "A data analytic approach to quantifying scientific impact," Journal of Informetrics, Elsevier, vol. 10(2), pages 471-484.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Iman Tahamtan & Askar Safipour Afshar & Khadijeh Ahamdzadeh, 2016. "Factors affecting number of citations: a comprehensive review of the literature," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1195-1225, June.
    2. Martorell Cunil, Onofre & Otero González, Luis & Durán Santomil, Pablo & Mulet Forteza, Carlos, 2023. "How to accomplish a highly cited paper in the tourism, leisure and hospitality field," Journal of Business Research, Elsevier, vol. 157(C).
    3. Kaile Gong & Juan Xie & Ying Cheng & Vincent Larivière & Cassidy R. Sugimoto, 2019. "The citation advantage of foreign language references for Chinese social science papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(3), pages 1439-1460, September.
    4. Thelwall, Mike & Wilson, Paul, 2014. "Regression for citation data: An evaluation of different methods," Journal of Informetrics, Elsevier, vol. 8(4), pages 963-971.
    5. Mingyang Wang & Zhenyu Wang & Guangsheng Chen, 2019. "Which can better predict the future success of articles? Bibliometric indices or alternative metrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1575-1595, June.
    6. Elizabeth S. Vieira, 2023. "The influence of research collaboration on citation impact: the countries in the European Innovation Scoreboard," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(6), pages 3555-3579, June.
    7. Tahamtan, Iman & Bornmann, Lutz, 2018. "Core elements in the process of citing publications: Conceptual overview of the literature," Journal of Informetrics, Elsevier, vol. 12(1), pages 203-216.
    8. Copiello, Sergio, 2019. "Peer and neighborhood effects: Citation analysis using a spatial autoregressive model and pseudo-spatial data," Journal of Informetrics, Elsevier, vol. 13(1), pages 238-254.
    9. Zhang, Xinyuan & Xie, Qing & Song, Min, 2021. "Measuring the impact of novelty, bibliometric, and academic-network factors on citation count using a neural network," Journal of Informetrics, Elsevier, vol. 15(2).
    10. Didegah, Fereshteh & Thelwall, Mike, 2013. "Which factors help authors produce the highest impact research? Collaboration, journal and document properties," Journal of Informetrics, Elsevier, vol. 7(4), pages 861-873.
    11. Yezhu Wang & Yundong Xie & Dong Wang & Lu Guo & Rongting Zhou, 2022. "Do cover papers get better citations and usage counts? An analysis of 42 journals in cell biology," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(7), pages 3793-3813, July.
    12. Stegehuis, Clara & Litvak, Nelly & Waltman, Ludo, 2015. "Predicting the long-term citation impact of recent publications," Journal of Informetrics, Elsevier, vol. 9(3), pages 642-657.
    13. Zahedi, Zohreh & Haustein, Stefanie, 2018. "On the relationships between bibliographic characteristics of scientific documents and citation and Mendeley readership counts: A large-scale analysis of Web of Science publications," Journal of Informetrics, Elsevier, vol. 12(1), pages 191-202.
    14. Reingewertz, Yaniv & Lutmar, Carmela, 2018. "Academic in-group bias: An empirical examination of the link between author and journal affiliation," Journal of Informetrics, Elsevier, vol. 12(1), pages 74-86.
    15. Stefano Mammola & Diego Fontaneto & Alejandro Martínez & Filipe Chichorro, 2021. "Impact of the reference list features on the number of citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(1), pages 785-799, January.
    16. Uddin, Shahadat & Khan, Arif, 2016. "The impact of author-selected keywords on citation counts," Journal of Informetrics, Elsevier, vol. 10(4), pages 1166-1177.
    17. Yifan Qian & Wenge Rong & Nan Jiang & Jie Tang & Zhang Xiong, 2017. "Citation regression analysis of computer science publications in different ranking categories and subfields," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(3), pages 1351-1374, March.
    18. Hanssen, Thor-Erik Sandberg & Jørgensen, Finn, 2015. "The value of experience in research," Journal of Informetrics, Elsevier, vol. 9(1), pages 16-24.
    19. Barbara McGillivray & Mathias Astell, 2019. "The relationship between usage and citations in an open access mega-journal," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(2), pages 817-838, November.
    20. Kong, Ling & Wang, Dongbo, 2020. "Comparison of citations and attention of cover and non-cover papers," Journal of Informetrics, Elsevier, vol. 14(4).
