
Impact Factors and the Central Limit Theorem: Why citation averages are scale dependent

Author

Listed:
  • Antonoyiannakis, Manolis

Abstract

Citation averages, and Impact Factors (IFs) in particular, are sensitive to sample size. Here, we apply the Central Limit Theorem to IFs to understand their scale-dependent behavior. For a journal of n randomly selected papers from a population of all papers, we expect from the Theorem that its IF fluctuates around the population average μ, and spans a range of values proportional to σ/√n, where σ² is the variance of the population's citation distribution. The 1/√n dependence has profound implications for IF rankings: the larger a journal, the narrower the range around μ where its IF lies. IF rankings therefore allocate an unfair advantage to smaller journals in the high IF ranks, and to larger journals in the low IF ranks. As a result, we expect a scale-dependent stratification of journals in IF rankings, whereby small journals occupy the top, middle, and bottom ranks; mid-sized journals occupy the middle ranks; and very large journals have IFs that asymptotically approach μ. We obtain qualitative and quantitative confirmation of these predictions by analyzing (i) the complete set of 166,498 IF & journal-size data pairs in the 1997–2016 Journal Citation Reports of Clarivate Analytics, (ii) the top-cited portion of 276,000 physics papers published in 2014–2015, and (iii) the citation distributions of an arbitrarily sampled list of physics journals. We conclude that the Central Limit Theorem is a good predictor of the IF range of actual journals, while sustained deviations from its predictions are a mark of true, non-random citation impact. IF rankings are thus misleading unless one compares like-sized journals or adjusts for these effects. We propose the Φ index, a rescaled IF that accounts for size effects, and which can be readily generalized to account also for different citation practices across research fields. Our methodology applies to other citation averages that are used to compare research fields, university departments, or countries in various types of rankings.
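The scale dependence the abstract describes can be demonstrated with a small simulation: draw many random "journals" of n papers each from a fixed citation population and measure how the spread of their citation averages shrinks with n. This is a minimal sketch assuming a lognormal citation distribution as an illustrative stand-in for real citation data; the distribution parameters and journal sizes below are arbitrary choices, not the paper's dataset.

```python
import random
import statistics

random.seed(42)

def simulated_if_spread(journal_size, n_journals=1000):
    """Standard deviation of the citation average across many random
    'journals' of journal_size papers each, drawn from a skewed
    (lognormal) citation population. By the Central Limit Theorem this
    spread should scale as sigma / sqrt(journal_size)."""
    averages = []
    for _ in range(n_journals):
        papers = [random.lognormvariate(1.0, 1.2) for _ in range(journal_size)]
        averages.append(sum(papers) / journal_size)
    return statistics.stdev(averages)

# Spread of the "Impact Factor" narrows as journal size grows:
for n in (10, 100, 1000):
    print(f"journal size {n:>4}: IF spread ~ {simulated_if_spread(n):.3f}")
```

A hundredfold increase in journal size should shrink the spread roughly tenfold (the √n factor), which is why only small journals can reach the extreme top or bottom of an IF ranking while very large journals cluster near the population mean μ.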

Suggested Citation

  • Antonoyiannakis, Manolis, 2018. "Impact Factors and the Central Limit Theorem: Why citation averages are scale dependent," Journal of Informetrics, Elsevier, vol. 12(4), pages 1072-1088.
  • Handle: RePEc:eee:infome:v:12:y:2018:i:4:p:1072-1088
    DOI: 10.1016/j.joi.2018.08.011

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157718301238
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.joi.2018.08.011?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.


    Citations



    Cited by:

    1. Prem Kumar Singh, 2022. "t-index: entropy based random document and citation analysis using average h-index," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(1), pages 637-660, January.
    2. Guoliang Lyu & Ganwei Shi, 2019. "On an approach to boosting a journal’s citation potential," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(3), pages 1387-1409, September.
    3. Loet Leydesdorff & Lutz Bornmann & Jonathan Adams, 2019. "The integrated impact indicator revisited (I3*): a non-parametric alternative to the journal impact factor," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1669-1694, June.
    4. Mingyang Wang & Shijia Jiao & Kah-Hin Chai & Guangsheng Chen, 2019. "Building journal’s long-term impact: using indicators detected from the sustained active articles," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(1), pages 261-283, October.
    5. Gangan Prathap, 2019. "Scale-dependent stratification: a skyline–shoreline scatter plot," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(2), pages 1269-1273, May.
    6. Raminta Pranckutė, 2021. "Web of Science (WoS) and Scopus: The Titans of Bibliographic Information in Today’s Academic World," Publications, MDPI, vol. 9(1), pages 1-59, March.
    7. Pislyakov, Vladimir, 2022. "On some properties of medians, percentiles, baselines, and thresholds in empirical bibliometric analysis," Journal of Informetrics, Elsevier, vol. 16(4).
    8. Emanuel Kulczycki & Marek Hołowiecki & Zehra Taşkın & Franciszek Krawczyk, 2021. "Citation patterns between impact-factor and questionable journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(10), pages 8541-8560, October.
    9. Wu, Lingfei & Kittur, Aniket & Youn, Hyejin & Milojević, Staša & Leahey, Erin & Fiore, Stephen M. & Ahn, Yong-Yeol, 2022. "Metrics and mechanisms: Measuring the unmeasurable in the science of science," Journal of Informetrics, Elsevier, vol. 16(2).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhesi Shen & Liying Yang & Zengru Di & Jinshan Wu, 2019. "Large enough sample size to rank two groups of data reliably according to their means," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(2), pages 653-671, February.
    2. Lutz Bornmann & Klaus Wohlrabe, 2019. "Normalisation of citation impact in economics," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(2), pages 841-884, August.
    3. Milojević, Staša & Radicchi, Filippo & Bar-Ilan, Judit, 2017. "Citation success index − An intuitive pair-wise journal comparison metric," Journal of Informetrics, Elsevier, vol. 11(1), pages 223-231.
    4. Loet Leydesdorff & Lutz Bornmann & Jonathan Adams, 2019. "The integrated impact indicator revisited (I3*): a non-parametric alternative to the journal impact factor," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1669-1694, June.
    5. Waltman, Ludo, 2016. "A review of the literature on citation impact indicators," Journal of Informetrics, Elsevier, vol. 10(2), pages 365-391.
    6. Anthony F J van Raan, 2013. "Universities Scale Like Cities," PLOS ONE, Public Library of Science, vol. 8(3), pages 1-14, March.
    7. Thelwall, Mike, 2016. "Are there too many uncited articles? Zero inflated variants of the discretised lognormal and hooked power law distributions," Journal of Informetrics, Elsevier, vol. 10(2), pages 622-633.
    8. Loet Leydesdorff & Paul Wouters & Lutz Bornmann, 2016. "Professional and citizen bibliometrics: complementarities and ambivalences in the development and use of indicators—a state-of-the-art report," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(3), pages 2129-2150, December.
    9. Yves Fassin, 2020. "The HF-rating as a universal complement to the h-index," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 965-990, November.
    10. Lutz Bornmann & Richard Williams, 2020. "An evaluation of percentile measures of citation impact, and a proposal for making them better," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(2), pages 1457-1478, August.
    11. Giancarlo Ruocco & Cinzia Daraio, 2013. "An empirical approach to compare the performance of heterogeneous academic fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 97(3), pages 601-625, December.
    12. Pedro Albarrán & Antonio Perianes-Rodríguez & Javier Ruiz-Castillo, 2015. "Differences in citation impact across countries," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(3), pages 512-525, March.
    13. Giovanni Abramo & Ciriaco Andrea D’Angelo & Flavia Costa, 2023. "Correlating article citedness and journal impact: an empirical investigation by field on a large-scale dataset," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(3), pages 1877-1894, March.
    14. Albarrán, Pedro & Herrero, Carmen & Ruiz-Castillo, Javier & Villar, Antonio, 2017. "The Herrero-Villar approach to citation impact," Journal of Informetrics, Elsevier, vol. 11(2), pages 625-640.
    15. Lutz Bornmann & Adam Y. Ye & Fred Y. Ye, 2018. "Identifying “hot papers” and papers with “delayed recognition” in large-scale datasets by using dynamically normalized citation impact scores," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 655-674, August.
    16. Bornmann, Lutz & Leydesdorff, Loet, 2017. "Skewness of citation impact data and covariates of citation distributions: A large-scale empirical analysis based on Web of Science data," Journal of Informetrics, Elsevier, vol. 11(1), pages 164-175.
    17. Gerson Pech & Catarina Delgado, 2020. "Percentile and stochastic-based approach to the comparison of the number of citations of articles indexed in different bibliographic databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(1), pages 223-252, April.
    18. Lutz Bornmann & Alexander Tekles & Loet Leydesdorff, 2019. "How well does I3 perform for impact measurement compared to other bibliometric indicators? The convergent validity of several (field-normalized) indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(2), pages 1187-1205, May.
    19. Mingers, John & Leydesdorff, Loet, 2015. "A review of theory and practice in scientometrics," European Journal of Operational Research, Elsevier, vol. 246(1), pages 1-19.
    20. Zhihui Zhang & Ying Cheng & Nian Cai Liu, 2015. "Improving the normalization effect of mean-based method from the perspective of optimization: optimization-based linear methods and their performance," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(1), pages 587-607, January.
