
Author name disambiguation in scientific collaboration and mobility cases

Author

Listed:
  • Jiang Wu

    (Wuhan University)

  • Xiu-Hao Ding

    (Huazhong University of Science and Technology)

Abstract

Scientists generally collaborate with one another and sometimes change their affiliations, which leads to scientific mobility. This paper proposes a recursive reinforced name disambiguation method that integrates coauthorship and affiliation information and is aimed especially at cases of scientific collaboration and mobility. The proposed method is evaluated on a dataset from the Thomson Reuters Scientific “Web of Science”, and the recall and precision of the algorithm are analyzed. To show the effect of name ambiguity on the h-index and g-index, the distributions of both indices before and after name disambiguation are also presented. Evaluation experiments show that using only affiliation information achieves better performance than using only coauthorship information; however, the proposed method, which integrates both coauthorship and affiliation information, controls the bias caused by name ambiguity to a greater extent.
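
To make the ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' recursive reinforced algorithm, whose details are not given on this page. It assumes a much simpler stand-in: records under one ambiguous name are linked transitively (via union-find) whenever they share a coauthor or an affiliation, so an author who changes affiliation but keeps a coauthor stays in one cluster. The record fields, the sample data, and the function names are all hypothetical. The h-index and g-index functions follow the standard definitions, with the g-index capped at the number of papers.

```python
from collections import defaultdict


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations.

    This variant caps g at the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def cluster_records(records):
    """Group records that share a coauthor or an affiliation, transitively.

    Simplified stand-in for a disambiguation step, not the paper's method.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # feature -> index of the first record carrying it
    for idx, rec in enumerate(records):
        keys = [("coauthor", c) for c in rec["coauthors"]]
        keys.append(("affiliation", rec["affiliation"]))
        for key in keys:
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


# Hypothetical records that all carry the same ambiguous author name.
records = [
    {"coauthors": {"Li M"}, "affiliation": "Univ A", "citations": 10},
    {"coauthors": {"Li M", "Chen H"}, "affiliation": "Univ B", "citations": 8},
    {"coauthors": {"Chen H"}, "affiliation": "Univ B", "citations": 5},
    {"coauthors": {"Zhao Q"}, "affiliation": "Univ C", "citations": 4},
    {"coauthors": {"Zhao Q"}, "affiliation": "Univ C", "citations": 4},
    {"coauthors": set(), "affiliation": "Univ C", "citations": 1},
]

clusters = cluster_records(records)
merged = [r["citations"] for r in records]
per_author = [[records[i]["citations"] for i in c] for c in clusters]

print("undisambiguated:", h_index(merged), g_index(merged))
for cites in per_author:
    print("disambiguated:", h_index(cites), g_index(cites))
```

In this toy data the first record (Univ A) and the second (Univ B) end up in the same cluster only because they share a coauthor, which is the mobility case the abstract describes. Treating all six records as one author gives a higher h-index and g-index than either separated author has, illustrating the kind of bias that disambiguation is meant to control.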

Suggested Citation

  • Jiang Wu & Xiu-Hao Ding, 2013. "Author name disambiguation in scientific collaboration and mobility cases," Scientometrics, Springer;Akadémiai Kiadó, vol. 96(3), pages 683-697, September.
  • Handle: RePEc:spr:scient:v:96:y:2013:i:3:d:10.1007_s11192-013-0978-8
    DOI: 10.1007/s11192-013-0978-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-013-0978-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Juan E. Iglesias & Carlos Pecharromán, 2007. "Scaling the h-index for different scientific ISI fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 73(3), pages 303-320, December.
    2. Steven Wooding & Kate Wilcox-Jay & Grant Lewison & Jonathan Grant, 2006. "Co-author inclusion: A novel recursive algorithmic method for dealing with homonyms in bibliometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 66(1), pages 11-21, January.
    3. Li Tang & John P. Walsh, 2010. "Bibliometric fingerprints: name disambiguation based on approximate structure equivalence of cognitive maps," Scientometrics, Springer;Akadémiai Kiadó, vol. 84(3), pages 763-784, September.
    4. Thomas Gurney & Edwin Horlings & Peter van den Besselaar, 2012. "Author disambiguation using multi-aspect similarity indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(2), pages 435-449, May.
    5. Raf Guns & Yu Xian Liu & Dilruba Mahbuba, 2011. "Q-measures and betweenness centrality in a collaboration network: a case study of the field of informetrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 87(1), pages 133-147, April.
    6. Natsuo Onodera & Mariko Iwasawa & Nobuyuki Midorikawa & Fuyuki Yoshikane & Kou Amano & Yutaka Ootani & Tadashi Kodama & Yasuhiko Kiyama & Hiroyuki Tsunoda & Shizuka Yamazaki, 2011. "A method for eliminating articles by homonymous authors from the large number of articles retrieved by author search," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 62(4), pages 677-690, April.
    7. José M. Soler, 2007. "Separating the articles of authors with the same name," Scientometrics, Springer;Akadémiai Kiadó, vol. 72(2), pages 281-290, August.
    8. Chung Joo Chung & Han Woo Park, 2012. "Web visibility of scholars in media and communication journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 93(1), pages 207-215, October.
    9. Leo Egghe, 2006. "Theory and practise of the g-index," Scientometrics, Springer;Akadémiai Kiadó, vol. 69(1), pages 131-152, October.
    10. Dangzhi Zhao & Andreas Strotmann, 2011. "Counting first, last, or all authors in citation analysis: A comprehensive comparison in the highly collaborative stem cell research field," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 62(4), pages 654-676, April.

    Citations

    Cited by:

    1. Jia Zhu & Yi Yang & Qing Xie & Liwei Wang & Saeed-Ul Hassan, 2014. "Robust hybrid name disambiguation framework for large databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(3), pages 2255-2274, March.
    2. Jia Zhu & Xingcheng Wu & Xueqin Lin & Changqin Huang & Gabriel Pui Cheong Fung & Yong Tang, 2018. "A novel multiple layers name disambiguation framework for digital libraries using dynamic clustering," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 781-794, March.
    3. Wang, Zhiqi & Chen, Yue & Glänzel, Wolfgang, 2020. "Preprints as accelerator of scholarly communication: An empirical analysis in Mathematics," Journal of Informetrics, Elsevier, vol. 14(4).
    4. Li Zhang & Wei Lu & Jinqing Yang, 2023. "LAGOS‐AND: A large gold standard dataset for scholarly author name disambiguation," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(2), pages 168-185, February.
    5. Andrea Ancona & Roy Cerqueti & Gianluca Vagnani, 2023. "A novel methodology to disambiguate organization names: an application to EU Framework Programmes data," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(8), pages 4447-4474, August.
    6. Jiang Wu & Miao Jin & Xiu-Hao Ding, 2015. "Diversity of individual research disciplines in scientific funding," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(2), pages 669-686, May.
    7. Hao Wu & Bo Li & Yijian Pei & Jun He, 2014. "Unsupervised author disambiguation using Dempster–Shafer theory," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(3), pages 1955-1972, December.
    8. Omar Hernando Avila-Poveda, 2014. "Technical report: the trend of author compound names and its implications for authorship identity identification," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 833-846, October.
    9. Dongwook Shin & Taehwan Kim & Joongmin Choi & Jungsun Kim, 2014. "Author name disambiguation using a graph model with node splitting and merging based on bibliographic information," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(1), pages 15-50, July.
    10. Jinseok Kim & Jenna Kim & Jason Owen‐Smith, 2021. "Ethnicity‐based name partitioning for author name disambiguation using supervised machine learning," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(8), pages 979-994, August.
    11. Vittorio Fuccella & Domenico De Stefano & Maria Prosperina Vitale & Susanna Zaccarin, 2016. "Improving co-authorship network structures by combining multiple data sources: evidence from Italian academic statisticians," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(1), pages 167-184, April.
    12. Jan Schulz, 2016. "Using Monte Carlo simulations to assess the impact of author name disambiguation quality on different bibliometric analyses," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1283-1298, June.
    13. Jinseok Kim & Jenna Kim, 2020. "Effect of forename string on author name disambiguation," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 71(7), pages 839-855, July.
    14. Jelena Smiljanić & Arnab Chatterjee & Tomi Kauppinen & Marija Mitrović Dankulov, 2016. "A Theoretical Model for the Associative Nature of Conference Participation," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-12, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rehs, Andreas, 2021. "A supervised machine learning approach to author disambiguation in the Web of Science," Journal of Informetrics, Elsevier, vol. 15(3).
    2. Jian Wang & Kaspars Berzins & Diana Hicks & Julia Melkers & Fang Xiao & Diogo Pinheiro, 2012. "A boosted-trees method for name disambiguation," Scientometrics, Springer;Akadémiai Kiadó, vol. 93(2), pages 391-411, November.
    3. Omar Hernando Avila-Poveda, 2014. "Technical report: the trend of author compound names and its implications for authorship identity identification," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 833-846, October.
    4. Wu, Jiang, 2013. "Investigating the universal distributions of normalized indicators and developing field-independent index," Journal of Informetrics, Elsevier, vol. 7(1), pages 63-71.
    5. Thomas Gurney & Edwin Horlings & Peter van den Besselaar, 2012. "Author disambiguation using multi-aspect similarity indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(2), pages 435-449, May.
    6. Perc, Matjaž, 2010. "Zipf’s law and log-normal distributions in measures of scientific output across fields and institutions: 40 years of Slovenia’s research as an example," Journal of Informetrics, Elsevier, vol. 4(3), pages 358-364.
    7. Maziar Montazerian & Edgar Dutra Zanotto & Hellmut Eckert, 2019. "A new parameter for (normalized) evaluation of H-index: countries as a case study," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(3), pages 1065-1078, March.
    8. Yang, Siluo & Han, Ruizhen & Wolfram, Dietmar & Zhao, Yuehua, 2016. "Visualizing the intellectual structure of information science (2006–2015): Introducing author keyword coupling analysis," Journal of Informetrics, Elsevier, vol. 10(1), pages 132-150.
    9. Kim, Ha Jin & Jeong, Yoo Kyung & Song, Min, 2016. "Content- and proximity-based author co-citation analysis using citation sentences," Journal of Informetrics, Elsevier, vol. 10(4), pages 954-966.
    10. Xuan Zhen Liu & Hui Fang, 2014. "Scientific group leaders’ authorship preferences: an empirical investigation," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(2), pages 909-925, February.
    11. Perianes-Rodriguez, Antonio & Waltman, Ludo & van Eck, Nees Jan, 2016. "Constructing bibliometric networks: A comparison between full and fractional counting," Journal of Informetrics, Elsevier, vol. 10(4), pages 1178-1195.
    12. Lorna Wildgaard & Jesper W. Schneider & Birger Larsen, 2014. "A review of the characteristics of 108 author-level bibliometric indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 125-158, October.
    13. Miguel A. García-Pérez, 2009. "A multidimensional extension to Hirsch’s h-index," Scientometrics, Springer;Akadémiai Kiadó, vol. 81(3), pages 779-785, December.
    14. Koski, Timo & Sandström, Erik & Sandström, Ulf, 2016. "Towards field-adjusted production: Estimating research productivity from a zero-truncated distribution," Journal of Informetrics, Elsevier, vol. 10(4), pages 1143-1152.
    15. R. Álvarez & E. Cahué & J. Clemente-Gallardo & A. Ferrer & D. Íñiguez & X. Mellado & A. Rivero & G. Ruiz & F. Sanz & E. Serrano & A. Tarancón & Y. Vergara, 2015. "Analysis of academic productivity based on Complex Networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 104(3), pages 651-672, September.
    16. J. W. Fedderke, 2013. "The objectivity of national research foundation peer review in South Africa assessed against bibliometric indexes," Scientometrics, Springer;Akadémiai Kiadó, vol. 97(2), pages 177-206, November.
    17. Yezhu Wang & Yundong Xie & Dong Wang & Lu Guo & Rongting Zhou, 2022. "Do cover papers get better citations and usage counts? An analysis of 42 journals in cell biology," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(7), pages 3793-3813, July.
    18. Wei, Shelia X. & Tong, Tong & Rousseau, Ronald & Wang, Wanru & Ye, Fred Y., 2022. "Relations among the h-, g-, ψ-, and p-index and offset-ability," Journal of Informetrics, Elsevier, vol. 16(4).
    19. S. Alonso & F. J. Cabrerizo & E. Herrera-Viedma & F. Herrera, 2010. "hg-index: a new index to characterize the scientific output of researchers based on the h- and g-indices," Scientometrics, Springer;Akadémiai Kiadó, vol. 82(2), pages 391-400, February.
    20. Jan Schulz, 2016. "Using Monte Carlo simulations to assess the impact of author name disambiguation quality on different bibliometric analyses," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1283-1298, June.
