
Normalization of zero-inflated data: An empirical analysis of a new indicator family and its use with altmetrics data

Author

Listed:
  • Bornmann, Lutz
  • Haunschild, Robin

Abstract

Recently, two new indicators (the Equalized Mean-based Normalized Proportion Cited, EMNPC, and the Mean-based Normalized Proportion Cited, MNPC) were proposed for sparse scientometric data such as alternative metrics (altmetrics). These indicators compare the proportion of mentioned papers (e.g., on Facebook) of a unit (e.g., a researcher or an institution) with the proportion of mentioned papers in the corresponding fields and publication years (the expected values). In this study, we propose a third indicator of the same family, the Mantel-Haenszel quotient (MHq). The MHq is based on Mantel-Haenszel (MH) analysis, an established statistical method for comparing proportions. Using citations and assessments by peers (F1000Prime recommendations), we test whether the three indicators can distinguish between quality levels defined on the basis of the peer assessments; that is, we test their convergent validity. We find that the MHq is able to distinguish between the quality levels in most cases, whereas the MNPC and EMNPC are not. Since the MHq is shown in this study to be a valid indicator, we apply it to six types of zero-inflated altmetrics data and test whether the different altmetrics sources are related to quality. The results show that the relationship between altmetrics (Wikipedia, Facebook, blogs, and news data) and assessments by peers is not as strong as the relationship between citations and assessments by peers; in fact, the relationship between citations and peer assessments is about two to three times stronger than the association between altmetrics and peer assessments.
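To make the underlying comparison of proportions concrete, the following sketch computes the standard Mantel-Haenszel common odds-ratio estimator across field/publication-year strata, the statistical building block the MHq draws on. This is a minimal illustration, not the paper's implementation: the function name, the data layout, and the counts are hypothetical assumptions, and the exact definition of the MHq (including how the reference sets are constructed) is given in the article itself.

```python
# Minimal sketch (Python) of the Mantel-Haenszel common odds-ratio estimator
# underlying MHq-style indicators. All names and counts below are illustrative
# assumptions, not data or code from the paper.

def mantel_haenszel_quotient(strata):
    """Mantel-Haenszel common odds ratio across field/year strata.

    Each stratum holds a 2x2 table:
      unit_mentioned / unit_unmentioned -- papers of the unit under study
      ref_mentioned  / ref_unmentioned  -- papers of the reference set
                                           (same field and publication year)
    """
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n = (s["unit_mentioned"] + s["unit_unmentioned"]
             + s["ref_mentioned"] + s["ref_unmentioned"])
        if n == 0:
            continue  # skip empty field/year cells
        numerator += s["unit_mentioned"] * s["ref_unmentioned"] / n
        denominator += s["unit_unmentioned"] * s["ref_mentioned"] / n
    return numerator / denominator if denominator > 0 else float("nan")


# Hypothetical example: two field/year strata for one institution.
strata = [
    {"unit_mentioned": 12, "unit_unmentioned": 88,
     "ref_mentioned": 900, "ref_unmentioned": 9100},
    {"unit_mentioned": 3, "unit_unmentioned": 47,
     "ref_mentioned": 150, "ref_unmentioned": 4850},
]
print(mantel_haenszel_quotient(strata))  # values > 1: mentioned more often than expected
```

Read this way, a quotient above 1 indicates that the unit's papers are mentioned (or cited) more often than expected from the field and publication-year reference sets, and a value below 1 indicates the opposite.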

Suggested Citation

  • Bornmann, Lutz & Haunschild, Robin, 2018. "Normalization of zero-inflated data: An empirical analysis of a new indicator family and its use with altmetrics data," Journal of Informetrics, Elsevier, vol. 12(3), pages 998-1011.
  • Handle: RePEc:eee:infome:v:12:y:2018:i:3:p:998-1011
    DOI: 10.1016/j.joi.2018.01.010

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157717303978
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.joi.2018.01.010?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to a location where you can use your library subscription to access this item.

    As access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Houqiang Yu, 2017. "Context of altmetrics data matters: an investigation of count type and user category," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(1), pages 267-283, April.
    2. Fairclough, Ruth & Thelwall, Mike, 2015. "National research impact indicators from Mendeley readers," Journal of Informetrics, Elsevier, vol. 9(4), pages 845-859.
    3. Bornmann, Lutz & Haunschild, Robin, 2016. "Normalization of Mendeley reader impact on the reader- and paper-side: A comparison of the mean discipline normalized reader score (MDNRS) with the mean normalized reader score (MNRS) and bare reader counts," Journal of Informetrics, Elsevier, vol. 10(3), pages 776-788.
    4. Xuemei Li & Mike Thelwall & Dean Giustini, 2012. "Validating online reference managers for scholarly impact measurement," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(2), pages 461-471, May.
    5. Waltman, Ludo & van Eck, Nees Jan & van Leeuwen, Thed N. & Visser, Martijn S. & van Raan, Anthony F.J., 2011. "Towards a new crown indicator: Some theoretical considerations," Journal of Informetrics, Elsevier, vol. 5(1), pages 37-47.
    6. Zohreh Zahedi & Rodrigo Costas & Paul Wouters, 2014. "How well developed are altmetrics? A cross-disciplinary analysis of the presence of ‘alternative metrics’ in scientific publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1491-1513, November.
    7. Haunschild, Robin & Bornmann, Lutz, 2016. "Normalization of Mendeley reader counts for impact assessment," Journal of Informetrics, Elsevier, vol. 10(1), pages 62-73.
    8. Lutz Bornmann, 2015. "Interrater reliability and convergent validity of F1000Prime peer review," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(12), pages 2415-2426, December.
    9. Bornmann, Lutz, 2014. "Do altmetrics point to the broader impact of research? An overview of benefits and disadvantages of altmetrics," Journal of Informetrics, Elsevier, vol. 8(4), pages 895-903.
    10. Lutz Bornmann & Robin Haunschild & Werner Marx, 2016. "Policy documents as sources for measuring societal impact: how often is climate change research mentioned in policy-related documents?," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(3), pages 1477-1495, December.
    11. Lutz Bornmann & Robin Haunschild, 2016. "How to normalize Twitter counts? A first attempt based on journals in the Twitter Index," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1405-1422, June.
    12. Ludo Waltman & Nees Jan van Eck & Thed N. van Leeuwen & Martijn S. Visser & Anthony F. J. van Raan, 2011. "Towards a new crown indicator: an empirical analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 87(3), pages 467-481, June.
    13. Thelwall, Mike, 2017. "Three practical field normalised alternative indicator formulae for research evaluation," Journal of Informetrics, Elsevier, vol. 11(1), pages 128-151.
    14. Williams, Richard & Bornmann, Lutz, 2016. "Sampling issues in bibliometric analysis," Journal of Informetrics, Elsevier, vol. 10(4), pages 1225-1232.
    15. Robin Haunschild & Lutz Bornmann, 2017. "How many scientific papers are mentioned in policy-related documents? An empirical investigation using Web of Science and Altmetric data," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(3), pages 1209-1216, March.
    16. Bornmann, Lutz, 2014. "Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime," Journal of Informetrics, Elsevier, vol. 8(4), pages 935-950.
    17. Ehsan Mohammadi & Mike Thelwall & Kayvan Kousha, 2016. "Can Mendeley bookmarks reflect readership? A survey of user motivations," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 67(5), pages 1198-1209, May.
    18. Ludo Waltman & Rodrigo Costas, 2014. "F1000 Recommendations as a Potential New Data Source for Research Evaluation: A Comparison With Citations," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 65(3), pages 433-445, March.
    19. Rons, Nadine, 2012. "Partition-based Field Normalization: An approach to highly specialized publication records," Journal of Informetrics, Elsevier, vol. 6(1), pages 1-10.
    20. Mojisola Erdt & Aarthy Nagarajan & Sei-Ching Joanna Sin & Yin-Leng Theng, 2016. "Altmetrics: an analysis of the state-of-the-art in measuring research impact on social media," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(2), pages 1117-1166, November.
    21. Claveau, François, 2016. "There should not be any mystery: A comment on sampling issues in bibliometrics," Journal of Informetrics, Elsevier, vol. 10(4), pages 1233-1240.
    22. Lutz Bornmann, 2015. "Alternative metrics in scientometrics: a meta-analysis of research into three altmetrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(3), pages 1123-1144, June.
    23. Franceschet, Massimo & Costantini, Antonio, 2011. "The first Italian research assessment exercise: A bibliometric perspective," Journal of Informetrics, Elsevier, vol. 5(2), pages 275-291.
    24. Amalia Mas-Bleda & Mike Thelwall, 2016. "Can alternative indicators overcome language biases in citation counts? A comparison of Spanish and UK research," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(3), pages 2007-2030, December.

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Huang, Cui & Yang, Chao & Su, Jun, 2021. "Identifying core policy instruments based on structural holes: A case study of China’s nuclear energy policy," Journal of Informetrics, Elsevier, vol. 15(2).
    2. Smolinsky, Lawrence & Klingenberg, Bernhard & Marx, Brian D., 2022. "Interpretation and inference for altmetric indicators arising from sparse data statistics," Journal of Informetrics, Elsevier, vol. 16(1).
    3. Sergio Copiello, 2020. "Other than detecting impact in advance, alternative metrics could act as early warning signs of retractions: tentative findings of a study into the papers retracted by PLoS ONE," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2449-2469, December.
    4. Bornmann, Lutz & Haunschild, Robin & Adams, Jonathan, 2019. "Do altmetrics assess societal impact in a comparable way to case studies? An empirical test of the convergent validity of altmetrics based on data from the UK research excellence framework (REF)," Journal of Informetrics, Elsevier, vol. 13(1), pages 325-340.
    5. Jianhua Hou & Bili Zheng & Yang Zhang & Chaomei Chen, 2021. "How do Price medalists’ scholarly impact change before and after their awards?," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5945-5981, July.
    6. Sergio Copiello, 2020. "Multi-criteria altmetric scores are likely to be redundant with respect to a subset of the underlying information," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 819-824, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Robin Haunschild & Lutz Bornmann, 2018. "Field- and time-normalization of data with many zeros: an empirical analysis using citation and Twitter data," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 997-1012, August.
    2. Bornmann, Lutz & Haunschild, Robin & Adams, Jonathan, 2019. "Do altmetrics assess societal impact in a comparable way to case studies? An empirical test of the convergent validity of altmetrics based on data from the UK research excellence framework (REF)," Journal of Informetrics, Elsevier, vol. 13(1), pages 325-340.
    3. Thelwall, Mike, 2017. "Three practical field normalised alternative indicator formulae for research evaluation," Journal of Informetrics, Elsevier, vol. 11(1), pages 128-151.
    4. Lutz Bornmann & Rüdiger Mutz & Robin Haunschild & Felix Moya-Anegon & Mirko Almeida Madeira Clemente & Moritz Stefaner, 2021. "Mapping the impact of papers on various status groups in excellencemapping.net: a new release of the excellence mapping tool based on citation and reader scores," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(11), pages 9305-9331, November.
    5. Lutz Bornmann & Robin Haunschild & Vanash M Patel, 2020. "Are papers addressing certain diseases perceived where these diseases are prevalent? The proposal to use Twitter data as social-spatial sensors," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-22, November.
    6. Mojisola Erdt & Aarthy Nagarajan & Sei-Ching Joanna Sin & Yin-Leng Theng, 2016. "Altmetrics: an analysis of the state-of-the-art in measuring research impact on social media," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(2), pages 1117-1166, November.
    7. Bornmann, Lutz & Haunschild, Robin, 2016. "Normalization of Mendeley reader impact on the reader- and paper-side: A comparison of the mean discipline normalized reader score (MDNRS) with the mean normalized reader score (MNRS) and bare reader counts," Journal of Informetrics, Elsevier, vol. 10(3), pages 776-788.
    8. Sergio Copiello, 2020. "Other than detecting impact in advance, alternative metrics could act as early warning signs of retractions: tentative findings of a study into the papers retracted by PLoS ONE," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2449-2469, December.
    9. Liwei Zhang & Jue Wang, 2021. "What affects publications’ popularity on Twitter?," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(11), pages 9185-9198, November.
    10. Thelwall, Mike & Fairclough, Ruth, 2017. "The accuracy of confidence intervals for field normalised indicators," Journal of Informetrics, Elsevier, vol. 11(2), pages 530-540.
    11. Zhichao Fang & Rodrigo Costas & Wencan Tian & Xianwen Wang & Paul Wouters, 2020. "An extensive analysis of the presence of altmetric data for Web of Science publications across subject fields and research topics," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(3), pages 2519-2549, September.
    12. Lutz Bornmann & Robin Haunschild, 2016. "How to normalize Twitter counts? A first attempt based on journals in the Twitter Index," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1405-1422, June.
    13. Ying Guo & Xiantao Xiao, 2022. "Author-level altmetrics for the evaluation of Chinese scholars," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(2), pages 973-990, February.
    14. Bornmann, Lutz & Leydesdorff, Loet, 2015. "Does quality and content matter for citedness? A comparison with para-textual factors and over time," Journal of Informetrics, Elsevier, vol. 9(3), pages 419-429.
    15. Mike Thelwall, 2018. "Differences between journals and years in the proportions of students, researchers and faculty registering Mendeley articles," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(2), pages 717-729, May.
    16. Ortega, José Luis, 2018. "The life cycle of altmetric impact: A longitudinal study of six metrics from PlumX," Journal of Informetrics, Elsevier, vol. 12(3), pages 579-589.
    17. Cristina López-Duarte & Marta M. Vidal-Suárez & Belén González-Díaz, 2019. "Cross-national distance and international business: an analysis of the most influential recent models," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(1), pages 173-208, October.
    18. Robin Haunschild & Lutz Bornmann, 2017. "How many scientific papers are mentioned in policy-related documents? An empirical investigation using Web of Science and Altmetric data," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(3), pages 1209-1216, March.
    19. Xuan Zhen Liu & Hui Fang, 2017. "What we can learn from tweets linking to research papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(1), pages 349-369, April.
    20. Thelwall, Mike, 2018. "Do females create higher impact research? Scopus citations and Mendeley readers for articles from five countries," Journal of Informetrics, Elsevier, vol. 12(4), pages 1031-1041.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:infome:v:12:y:2018:i:3:p:998-1011. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item and to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/joi.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.