Printed from https://ideas.repec.org/a/eee/infome/v12y2018i2p401-415.html

The lognormal distribution explains the remarkable pattern documented by characteristic scores and scales in scientometrics

Author

Listed:
  • Vîiu, Gabriel-Alexandru

Abstract

Characteristic scores and scales (CSS) – a well-established scientometric tool for the study of citation counts – have been used to document a striking phenomenon that characterizes citation distributions at high levels of aggregation: irrespective of scientific field and citation window, empirical studies find a persistent pattern whereby about 70% of scientific papers belong to the class of poorly cited papers, about 21% belong to the class of fairly cited papers, 6% to that of remarkably cited papers and 3% to the class of outstandingly cited papers. This article aims to advance the understanding of this remarkable result by examining it in the context of the lognormal distribution, a popular model used to describe citation counts across scientific fields. The article shows that the application of the CSS method to lognormal distributions provides a very good fit to the 70–21–6–3% empirical pattern, provided that these distributions are characterized by a standard deviation parameter in the range of about 0.8–1.3. The CSS pattern is essentially explainable as an epiphenomenon of the lognormal functional form and, more generally, as a consequence of the skewness of science, which is manifest in heavy-tailed citation distributions.
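
The abstract's central claim can be checked numerically with a minimal sketch. It is not the article's own procedure; it assumes a continuous lognormal (real citation counts are discrete) and the usual mean-based CSS construction, in which the first threshold is the overall mean and each later threshold is the mean of the papers at or above the previous one. The function name `css_shares` and its parameters are hypothetical, introduced only for illustration.

```python
from math import log
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def css_shares(sigma: float, n_thresholds: int = 3) -> list[float]:
    """Shares of the CSS classes for a continuous lognormal(mu, sigma).

    Thresholds are tracked in standardised log units z = (ln b - mu) / sigma,
    so the location parameter mu cancels and only sigma matters.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")

    # First threshold b1 is the overall mean: ln(b1) = mu + sigma^2 / 2.
    z = sigma / 2.0
    cuts = [z]
    for _ in range(n_thresholds - 1):
        # Truncated mean of a lognormal:
        # E[X | X > b] = exp(mu + sigma^2/2) * Phi(sigma - z) / Phi(-z)
        z = sigma / 2.0 + log(_PHI(sigma - z) / _PHI(-z)) / sigma
        cuts.append(z)

    # Class shares are differences of the CDF at successive thresholds.
    cdf = [0.0] + [_PHI(c) for c in cuts] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


if __name__ == "__main__":
    for s in (0.8, 1.0, 1.3):
        print(s, [round(100 * p, 1) for p in css_shares(s)])
```

Under these assumptions, a standard deviation parameter of 1.0 gives shares of roughly 69%, 21%, 7% and 3%, close to the empirical pattern. Because the thresholds depend on the distribution only through sigma, the location parameter does not affect the shares in this sketch.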

Suggested Citation

  • Vîiu, Gabriel-Alexandru, 2018. "The lognormal distribution explains the remarkable pattern documented by characteristic scores and scales in scientometrics," Journal of Informetrics, Elsevier, vol. 12(2), pages 401-415.
  • Handle: RePEc:eee:infome:v:12:y:2018:i:2:p:401-415
    DOI: 10.1016/j.joi.2018.02.002

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157717303887
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations

    Cited by:

    1. Alonso Rodríguez-Navarro & Ricardo Brito, 2019. "Probability and expected frequency of breakthroughs: basis and use of a robust method of research assessment," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(1), pages 213-235, April.
    2. Brito, Ricardo & Navarro, Alonso Rodríguez, 2021. "The inconsistency of h-index: A mathematical analysis," Journal of Informetrics, Elsevier, vol. 15(1).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ruiz-Castillo, Javier & Costas, Rodrigo, 2018. "Individual and field citation distributions in 29 broad scientific fields," Journal of Informetrics, Elsevier, vol. 12(3), pages 868-892.
    2. Brito, Ricardo & Rodríguez-Navarro, Alonso, 2018. "Research assessment by percentile-based double rank analysis," Journal of Informetrics, Elsevier, vol. 12(1), pages 315-329.
    3. Thelwall, Mike, 2016. "Are the discretised lognormal and hooked power law distributions plausible for citation data?," Journal of Informetrics, Elsevier, vol. 10(2), pages 454-470.
    4. Vîiu, Gabriel-Alexandru, 2017. "Disaggregated research evaluation through median-based characteristic scores and scales: a comparison with the mean-based approach," Journal of Informetrics, Elsevier, vol. 11(3), pages 748-765.
    5. Ruiz-Castillo, Javier & Waltman, Ludo, 2015. "Field-normalized citation impact indicators using algorithmically constructed classification systems of science," Journal of Informetrics, Elsevier, vol. 9(1), pages 102-117.
    6. Ruiz-Castillo, Javier & Costas, Rodrigo, 2014. "The skewness of scientific productivity," Journal of Informetrics, Elsevier, vol. 8(4), pages 917-934.
    7. Zhihui Zhang & Ying Cheng & Nian Cai Liu, 2015. "Improving the normalization effect of mean-based method from the perspective of optimization: optimization-based linear methods and their performance," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(1), pages 587-607, January.
    8. Waltman, Ludo, 2016. "A review of the literature on citation impact indicators," Journal of Informetrics, Elsevier, vol. 10(2), pages 365-391.
    9. Antonio Perianes-Rodriguez & Javier Ruiz-Castillo, 2016. "A comparison of two ways of evaluating research units working in different scientific fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(2), pages 539-561, February.
    10. Antonio Perianes-Rodriguez & Javier Ruiz-Castillo, 2016. "University citation distributions," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 67(11), pages 2790-2804, November.
    11. Alonso Rodríguez-Navarro & Ricardo Brito, 2019. "Probability and expected frequency of breakthroughs: basis and use of a robust method of research assessment," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(1), pages 213-235, April.
    12. Rodríguez-Navarro, Alonso & Brito, Ricardo, 2018. "Double rank analysis for research assessment," Journal of Informetrics, Elsevier, vol. 12(1), pages 31-41.
    13. Thelwall, Mike, 2017. "Three practical field normalised alternative indicator formulae for research evaluation," Journal of Informetrics, Elsevier, vol. 11(1), pages 128-151.
    14. Bouyssou, Denis & Marchant, Thierry, 2016. "Ranking authors using fractional counting of citations: An axiomatic approach," Journal of Informetrics, Elsevier, vol. 10(1), pages 183-199.
    15. Marcel Clermont & Johanna Krolak & Dirk Tunger, 2021. "Does the citation period have any effect on the informative value of selected citation indicators in research evaluations?," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1019-1047, February.
    16. Abramo, Giovanni & Cicero, Tindaro & D’Angelo, Ciriaco Andrea, 2012. "How important is choice of the scaling factor in standardizing citations?," Journal of Informetrics, Elsevier, vol. 6(4), pages 645-654.
    17. S. R. Goldberg & H. Anthony & T. S. Evans, 2015. "Modelling citation networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(3), pages 1577-1604, December.
    18. Brito, Ricardo & Navarro, Alonso Rodríguez, 2021. "The inconsistency of h-index: A mathematical analysis," Journal of Informetrics, Elsevier, vol. 15(1).
    19. Thelwall, Mike & Wilson, Paul, 2014. "Regression for citation data: An evaluation of different methods," Journal of Informetrics, Elsevier, vol. 8(4), pages 963-971.
    20. Thelwall, Mike & Fairclough, Ruth, 2017. "The accuracy of confidence intervals for field normalised indicators," Journal of Informetrics, Elsevier, vol. 11(2), pages 530-540.
