
Contextualization of topics: browsing through the universe of bibliographic information

Authors

  • Rob Koopman (OCLC Research)
  • Shenghui Wang (OCLC Research)
  • Andrea Scharnhorst (DANS-KNAW)

Abstract

This paper describes how semantic indexing can help to generate a contextual overview of topics and to visually compare clusters of articles. The method was originally developed for an innovative information exploration tool, called Ariadne, which operates on bibliographic databases with tens of millions of records (Koopman et al. in Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. doi: 10.1145/2702613.2732781, 2015b). In this paper, the method behind Ariadne is further developed and applied to the research question of the special issue “Same data, different results”: a better understanding of topic (re-)construction by different bibliometric approaches. For the case of the Astro dataset of 111,616 articles in astronomy and astrophysics, a new instantiation of the interactive exploration tool, LittleAriadne, has been created. This paper contributes in two ways to the overall challenge of delineating and defining topics. First, we produce two clustering solutions based on vector representations of articles in a lexical space. These vectors are built on semantic indexing of entities associated with those articles. Second, we discuss how LittleAriadne can be used to browse through the network of topical terms, authors, journals, citations and various cluster solutions of the Astro dataset. More specifically, we treat the assignment of an article to the different clustering solutions as an additional element of its bibliographic record. By applying the principle of semantic indexing to this extended list of entities of the bibliographic record, LittleAriadne in turn provides a visualization of the context of a specific clustering solution. It also conveys the similarity of article clusters produced by different algorithms, and thus offers an approach complementary to other possible means of comparison.
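The idea of representing articles as vectors built from their associated entities, and then comparing clusters from different solutions, can be illustrated with a small sketch. This is not the authors' implementation: the entity vectors, article records, cluster labels, the summation used to combine entity vectors, and the use of cosine similarity between cluster centroids are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

def normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def article_vector(entities, entity_vectors, dim):
    """Sum the vectors of an article's entities, then normalize (an assumed scheme)."""
    total = [0.0] * dim
    for entity in entities:
        for i, x in enumerate(entity_vectors.get(entity, [0.0] * dim)):
            total[i] += x
    return normalize(total)

def cluster_centroids(article_vectors, assignment, dim):
    """Normalized sum of article vectors per cluster label."""
    sums = defaultdict(lambda: [0.0] * dim)
    for article_id, label in assignment.items():
        for i, x in enumerate(article_vectors[article_id]):
            sums[label][i] += x
    return {label: normalize(vec) for label, vec in sums.items()}

# Hypothetical toy data: a 3-dimensional "lexical space" with made-up entity vectors.
DIM = 3
ENTITY_VECTORS = {
    "term:galaxy": [1.0, 0.0, 0.0],
    "term:redshift": [0.8, 0.2, 0.0],
    "term:solar flare": [0.0, 1.0, 0.0],
    "term:sunspot": [0.0, 0.9, 0.1],
}
ARTICLES = {
    "a1": ["term:galaxy", "term:redshift"],
    "a2": ["term:galaxy"],
    "a3": ["term:solar flare"],
    "a4": ["term:solar flare", "term:sunspot"],
}
# Two hypothetical clustering solutions over the same four articles.
SOLUTION_A = {"a1": "A1", "a2": "A1", "a3": "A2", "a4": "A2"}
SOLUTION_B = {"a1": "B1", "a2": "B2", "a3": "B2", "a4": "B2"}

article_vecs = {
    aid: article_vector(ents, ENTITY_VECTORS, DIM) for aid, ents in ARTICLES.items()
}
centroids_a = cluster_centroids(article_vecs, SOLUTION_A, DIM)
centroids_b = cluster_centroids(article_vecs, SOLUTION_B, DIM)

# Pairwise similarity between clusters of the two solutions.
similarity = {
    (a, b): cosine(va, vb)
    for a, va in centroids_a.items()
    for b, vb in centroids_b.items()
}

for pair, value in sorted(similarity.items()):
    print(pair, round(value, 3))
```

In this toy setting, clusters from the two solutions that share most of their articles end up with nearby centroids, which is the kind of cross-solution similarity the abstract describes LittleAriadne as conveying visually.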

Suggested Citation

  • Rob Koopman & Shenghui Wang & Andrea Scharnhorst, 2017. "Contextualization of topics: browsing through the universe of bibliographic information," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1119-1139, May.
  • Handle: RePEc:spr:scient:v:111:y:2017:i:2:d:10.1007_s11192-017-2303-4
    DOI: 10.1007/s11192-017-2303-4

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-017-2303-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11192-017-2303-4?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Philipp Mayr & Andrea Scharnhorst, 2015. "Scientometrics and information retrieval: weak-links revitalized," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2193-2199, March.
    2. Wolfgang Glänzel & Bart Thijs, 2017. "Using hybrid methods and ‘core documents’ for the representation of clusters and topics: the astronomy dataset," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1071-1087, May.
    3. Michel Zitt & Alain Lelu & Elise Bassecoulard, 2011. "Hybrid citation-word representations in science mapping: Portolan charts of research fields?," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 62(1), pages 19-39, January.
    4. Kevin W. Boyack, 2017. "Thesaurus-based methods for mapping contents of publication sets," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1141-1155, May.
    5. Theresa Velden & Kevin W. Boyack & Jochen Gläser & Rob Koopman & Andrea Scharnhorst & Shenghui Wang, 2017. "Comparison of topic extraction approaches and their results," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1169-1221, May.
    6. Nees Jan van Eck & Ludo Waltman, 2017. "Citation-based clustering of publications using CitNetExplorer and VOSviewer," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1053-1070, May.
    7. Kun Lu & Dietmar Wolfram, 2012. "Measuring author research relatedness: A comparison of word-based, topic-based, and author cocitation approaches," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(10), pages 1973-1986, October.
    8. Kevin W. Boyack, 2017. "Investigating the effect of global data on topic detection," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 999-1015, May.
    9. Shenghui Wang & Rob Koopman, 2017. "Clustering articles based on semantic similarity," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1017-1031, May.
    10. Rob Koopman & Shenghui Wang, 2017. "Mutual information based labelling and comparing clusters," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1157-1167, May.
    11. Leydesdorff, Loet & Welbers, Kasper, 2011. "The semantic mapping of words and co-words in contexts," Journal of Informetrics, Elsevier, vol. 5(3), pages 469-475.

    Citations



    Cited by:

    1. Jochen Gläser & Wolfgang Glänzel & Andrea Scharnhorst, 2017. "Same data—different results? Towards a comparative approach to the identification of thematic structures in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 981-998, May.
    2. Takahiro Kawamura & Katsutaro Watanabe & Naoya Matsumoto & Shusaku Egami & Mari Jibu, 2018. "Funding map using paragraph embedding based on semantic diversity," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 941-958, August.
    3. Rob Koopman & Shenghui Wang, 2017. "Mutual information based labelling and comparing clusters," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1157-1167, May.
    4. Mohammed Azmi Al-Betar & Ammar Kamal Abasi & Ghazi Al-Naymat & Kamran Arshad & Sharif Naser Makhadmeh, 2023. "Optimization of scientific publications clustering with ensemble approach for topic extraction," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(5), pages 2819-2877, May.
    5. Shenghui Wang & Rob Koopman, 2017. "Clustering articles based on semantic similarity," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1017-1031, May.
    6. Shiyun Wang & Jin Mao & Yujie Cao & Gang Li, 2022. "Integrated knowledge content in an interdisciplinary field: identification, classification, and application," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6581-6614, November.
    7. Theresa Velden & Kevin W. Boyack & Jochen Gläser & Rob Koopman & Andrea Scharnhorst & Shenghui Wang, 2017. "Comparison of topic extraction approaches and their results," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1169-1221, May.
    8. Theresa Velden & Shiyan Yan & Carl Lagoze, 2017. "Mapping the cognitive structure of astrophysics by infomap clustering of the citation network and topic affinity analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1033-1051, May.
    9. Paul Donner, 2021. "Validation of the Astro dataset clustering solutions with external data," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1619-1645, February.
    10. Peter Sjögårde & Per Ahlgren & Ludo Waltman, 2021. "Algorithmic labeling in hierarchical classifications of publications: Evaluation of bibliographic fields and term weighting approaches," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(7), pages 853-869, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Theresa Velden & Kevin W. Boyack & Jochen Gläser & Rob Koopman & Andrea Scharnhorst & Shenghui Wang, 2017. "Comparison of topic extraction approaches and their results," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1169-1221, May.
    2. Shuo Xu & Junwan Liu & Dongsheng Zhai & Xin An & Zheng Wang & Hongshen Pang, 2018. "Overlapping thematic structures extraction with mixed-membership stochastic blockmodel," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 61-84, October.
    3. Jochen Gläser & Wolfgang Glänzel & Andrea Scharnhorst, 2017. "Same data—different results? Towards a comparative approach to the identification of thematic structures in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 981-998, May.
    4. Frank Havemann & Jochen Gläser & Michael Heinz, 2017. "Memetic search for overlapping topics based on a local evaluation of link communities," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1089-1118, May.
    5. Paul Donner, 2021. "Validation of the Astro dataset clustering solutions with external data," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1619-1645, February.
    6. Mohammed Azmi Al-Betar & Ammar Kamal Abasi & Ghazi Al-Naymat & Kamran Arshad & Sharif Naser Makhadmeh, 2023. "Optimization of scientific publications clustering with ensemble approach for topic extraction," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(5), pages 2819-2877, May.
    7. Matthias Held & Grit Laudel & Jochen Gläser, 2021. "Challenges to the validity of topic reconstruction," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4511-4536, May.
    8. Ballester, Omar & Penner, Orion, 2022. "Robustness, replicability and scalability in topic modelling," Journal of Informetrics, Elsevier, vol. 16(1).
    9. Sjögårde, Peter & Ahlgren, Per, 2018. "Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics," Journal of Informetrics, Elsevier, vol. 12(1), pages 133-152.
    10. García-Lillo, Francisco & Seva-Larrosa, Pedro & Sánchez-García, Eduardo, 2023. "What is going on in entrepreneurship research? A bibliometric and SNA analysis," Journal of Business Research, Elsevier, vol. 158(C).
    11. Christian Weismayer & Ilona Pezenka, 2017. "Identifying emerging research fields: a longitudinal latent semantic keyword analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(3), pages 1757-1785, December.
    12. Theresa Velden & Shiyan Yan & Carl Lagoze, 2017. "Mapping the cognitive structure of astrophysics by infomap clustering of the citation network and topic affinity analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1033-1051, May.
    13. Samira Ranaei & Arho Suominen & Alan Porter & Stephen Carley, 2020. "Evaluating technological emergence using text analytics: two case technologies and three approaches," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(1), pages 215-247, January.
    14. Xu, Haiyun & Winnink, Jos & Yue, Zenghui & Zhang, Huiling & Pang, Hongshen, 2021. "Multidimensional Scientometric indicators for the detection of emerging research topics," Technological Forecasting and Social Change, Elsevier, vol. 163(C).
    15. Coccia, Mario & Wang, Lili, 2015. "Path-breaking directions of nanotechnology-based chemotherapy and molecular cancer therapy," Technological Forecasting and Social Change, Elsevier, vol. 94(C), pages 155-169.
    16. Ding, Ying, 2011. "Community detection: Topological vs. topical," Journal of Informetrics, Elsevier, vol. 5(4), pages 498-514.
    17. Yang, Siluo & Han, Ruizhen & Wolfram, Dietmar & Zhao, Yuehua, 2016. "Visualizing the intellectual structure of information science (2006–2015): Introducing author keyword coupling analysis," Journal of Informetrics, Elsevier, vol. 10(1), pages 132-150.
    18. Jun-Ping Qiu & Ke Dong & Hou-Qiang Yu, 2014. "Comparative study on structure and correlation among author co-occurrence networks in bibliometrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1345-1360, November.
    19. Michel Zitt, 2015. "Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2223-2245, March.
    20. Lu Huang & Yijie Cai & Erdong Zhao & Shengting Zhang & Yue Shu & Jiao Fan, 2022. "Measuring the interdisciplinarity of Information and Library Science interactions using citation analysis and semantic analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6733-6761, November.
