The quality of the web of science data: a longitudinal study on the completeness of authors-addresses links

Author

Listed:
  • Abdelghani Maddi

    (Observatoire des Sciences et Techniques, Hcéres)

  • Lesya Baudoin

    (Observatoire des Sciences et Techniques, Hcéres; Institut Pour La Recherche en Santé Publique BIOPARK)

Abstract

Author-affiliation links are essential for multiple purposes, such as author disambiguation, the attribution of credit for a publication and fractional counting, and the analysis of scientific networks. In this article, we analyze the quality of author-affiliation links in the Web of Science (WoS) database between 2000 and 2021. We examine link completeness for 32,676,914 scientific publications from three angles: WoS index, document type, and number of authors per publication. The analysis shows that author-affiliation links begin to be well populated from 2008 onward. The share of publications for which all addresses and all authors are linked is close to 100% from 2016 onward. The results vary strongly by WoS index, document type, and number of authors per publication. AHCI is the index with the highest completeness rate, in contrast to the SCI. Among document types, conference proceedings are where the completeness rate is better and/or the links can be completed. Regarding the number of authors, the statistics show that the higher the number, the more unlinked addresses and authors there are. Finally, an analysis of a random sample of 100 publications showed that in more than 50% of cases the author-address links do not exist in the original publication, and WoS reproduced only the information provided by the editor.
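The completeness measure described in the abstract (the share of publications for which all addresses and all authors are linked) can be illustrated with a minimal sketch. The record layout, the function names and the sample data below are hypothetical and are not taken from the article or from the WoS schema; they only show one plausible way to compute such a share.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Publication:
    # Hypothetical record layout (an assumption, not the WoS schema).
    authors: list[str]
    addresses: list[str]
    # Each link is an (author, address) pair.
    links: set[tuple[str, str]] = field(default_factory=set)


def is_fully_linked(pub: Publication) -> bool:
    """True when every author and every address appears in at least one link."""
    linked_authors = {author for author, _ in pub.links}
    linked_addresses = {address for _, address in pub.links}
    return (set(pub.authors) <= linked_authors
            and set(pub.addresses) <= linked_addresses)


def completeness_rate(pubs: list[Publication]) -> float:
    """Share of publications in which all authors and addresses are linked."""
    if not pubs:
        return 0.0
    return sum(is_fully_linked(p) for p in pubs) / len(pubs)


# Illustrative, made-up records.
sample = [
    # All authors and addresses linked.
    Publication(["A", "B"], ["Addr1", "Addr2"],
                {("A", "Addr1"), ("B", "Addr2")}),
    # Author "B" has no link.
    Publication(["A", "B"], ["Addr1"], {("A", "Addr1")}),
    # Address "Addr2" has no link.
    Publication(["A"], ["Addr1", "Addr2"], {("A", "Addr1")}),
]

if __name__ == "__main__":
    print(f"Completeness rate: {completeness_rate(sample):.2f}")
```

Grouping the records by index, document type or author count before calling `completeness_rate` would give breakdowns along the angles the abstract mentions, again under the assumed record layout.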

Suggested Citation

  • Abdelghani Maddi & Lesya Baudoin, 2022. "The quality of the web of science data: a longitudinal study on the completeness of authors-addresses links," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6279-6292, November.
  • Handle: RePEc:spr:scient:v:127:y:2022:i:11:d:10.1007_s11192-022-04525-0
    DOI: 10.1007/s11192-022-04525-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-022-04525-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Raminta Pranckutė, 2021. "Web of Science (WoS) and Scopus: The Titans of Bibliographic Information in Today’s Academic World," Publications, MDPI, vol. 9(1), pages 1-59, March.
    2. Weishu Liu & Meiting Huang & Haifeng Wang, 2021. "Same journal but different numbers of published records indexed in Scopus and Web of Science Core Collection: causes, consequences, and solutions," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4541-4550, May.
    3. Abramo, Giovanni & D’Angelo, Ciriaco Andrea & Di Costa, Flavia, 2020. "Knowledge spillovers: Does the geographic proximity effect decay over time? A discipline-level analysis, accounting for cognitive proximity, with and without self-citations," Journal of Informetrics, Elsevier, vol. 14(4).
    4. Shirley Ainsworth & Jane M. Russell, 2018. "Has hosting on science direct improved the visibility of Latin American scholarly journals? A preliminary analysis of data quality," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(3), pages 1463-1484, June.
    5. Liu, Weishu & Hu, Guangyuan & Tang, Li, 2018. "Missing author address information in Web of Science—An explorative study," Journal of Informetrics, Elsevier, vol. 12(3), pages 985-997.
    6. Houqiang Yu & Xueting Cao & Tingting Xiao & Zhenyi Yang, 2020. "How accurate are policy document mentions? A first look at the role of altmetrics database," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 1517-1540, November.
    7. Gerson Pech & Catarina Delgado, 2020. "Assessing the publication impact using citation data from both Scopus and WoS databases: an approach validated in 15 research fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 909-924, November.
    8. Giovanni Abramo & Francesca Apponi & Ciriaco Andrea D’Angelo, 2022. "The geographic proximity effect on domestic cross-sector vis-à-vis intra-sector research collaborations," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(6), pages 3505-3521, June.
    9. Waltman, Ludo, 2016. "A review of the literature on citation impact indicators," Journal of Informetrics, Elsevier, vol. 10(2), pages 365-391.
    10. Igor Savchenko & Denis Kosyakov, 2022. "Lost in affiliation: apatride publications in international databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(6), pages 3471-3487, June.
    11. Shuo Xu & Liyuan Hao & Xin An & Dongsheng Zhai & Hongshen Pang, 2019. "Types of DOI errors of cited references in Web of Science with a cleaning method," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(3), pages 1427-1437, September.
    12. Abramo, Giovanni & D’Angelo, Ciriaco Andrea & Di Costa, Flavia, 2020. "The role of geographical proximity in knowledge diffusion, measured by citations to scientific literature," Journal of Informetrics, Elsevier, vol. 14(1).
    13. Giovanni Abramo & Francesca Apponi & Ciriaco Andrea D'Angelo, 2022. "The geographic proximity effect on domestic cross-sector vis-a-vis intra-sector research collaborations," Papers 2202.10347, arXiv.org.
    14. Junwen Zhu & Fang Liu & Weishu Liu, 2019. "The secrets behind Web of Science’s DOI search," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1745-1753, June.
    15. Giovanni Abramo & Ciriaco Andrea D’Angelo & Flavia Costa, 2020. "Does the geographic proximity effect on knowledge spillovers vary across research fields?," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(2), pages 1021-1036, May.
    16. Marek Kwiek & Wojciech Roszka, 2022. "Are female scientists less inclined to publish alone? The gender solo research gap," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 1697-1735, April.
    17. Abramo, Giovanni & D’Angelo, Ciriaco Andrea & Di Costa, Flavia, 2021. "On the relation between the degree of internationalization of cited and citing publications: A field level analysis, including and excluding self-citations," Journal of Informetrics, Elsevier, vol. 15(1).
    18. Abramo, Giovanni & D'Angelo, Ciriaco Andrea & Di Costa, Flavia, 2019. "Diversification versus specialization in scientific research: Which strategy pays off?," Technovation, Elsevier, vol. 82, pages 51-57.
    19. Abramo, Giovanni & D'Angelo, Ciriaco Andrea & Di Costa, Flavia, 2021. "The scholarly impact of private sector research: A multivariate analysis," Journal of Informetrics, Elsevier, vol. 15(3).
    20. Josh Yamamoto & Eitan Frachtenberg, 2022. "Gender Differences in Collaboration Patterns in Computer Science," Publications, MDPI, vol. 10(1), pages 1-21, February.

    More about this item

    Keywords

    Web of science; Metadata; Authors-addresses links; Data quality; Bibliometrics;

    JEL classification:

    • C8 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs
    • Y1 - Miscellaneous Categories - - Data: Tables and Charts
    • D8 - Microeconomics - - Information, Knowledge, and Uncertainty
