
Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics

Author

Listed:
  • Sjögårde, Peter
  • Ahlgren, Per

Abstract

The purpose of this study is to find a theoretically grounded, practically applicable and useful granularity level of an algorithmically constructed publication-level classification of research publications (ACPLC). The level addressed is the level of research topics. The methodology we propose uses synthesis papers and their reference articles to construct a baseline classification. A dataset of about 31 million publications, and their mutual citation relations, is used to obtain several ACPLCs of different granularity. Each ACPLC is compared to the baseline classification and the best performing ACPLC is identified. The results of two case studies show that the topics of the cases are closely associated with different classes of the identified ACPLC, and that these classes tend to treat only one topic. Further, the class size variation is moderate, and only a small proportion of the publications belong to very small classes. For these reasons, we conclude that the proposed methodology is suitable for determining the topic granularity level of an ACPLC and that the ACPLC identified by this methodology is useful for bibliometric analyses.
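
The comparison step described in the abstract (score each candidate classification against the baseline, keep the best one) can be illustrated with a small sketch. The abstract does not say which agreement measure the authors use; the adjusted Rand index (Hubert & Arabie, 1985) is assumed here purely for illustration. The function names, the candidate labels ("coarse", "topic", "fine") and the toy data are all hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must cover the same items")
    pairs_total = comb(len(labels_a), 2)
    # Item pairs that share a class in both partitions, in A only, in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    # Agreement expected by chance, and the maximum attainable agreement.
    expected = together_a * together_b / pairs_total if pairs_total else 0.0
    max_index = (together_a + together_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial and identical
    return (together_both - expected) / (max_index - expected)


def best_granularity(baseline, candidates):
    """Return (name, score) of the candidate that agrees best with the baseline."""
    scores = {name: adjusted_rand_index(baseline, labels)
              for name, labels in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data (hypothetical): six publications, baseline of two topics.
baseline = [0, 0, 0, 1, 1, 1]
candidates = {
    "coarse": [0, 0, 0, 0, 0, 0],  # everything in one class
    "topic": [5, 5, 5, 9, 9, 9],   # same grouping as the baseline, other ids
    "fine": [0, 1, 2, 3, 4, 5],    # every publication in its own class
}
print(best_granularity(baseline, candidates))
```

In this toy case the "topic" candidate reproduces the baseline grouping exactly, while the overly coarse and overly fine candidates score no better than chance, which mirrors the idea of selecting a granularity level by agreement with a baseline.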

Suggested Citation

  • Sjögårde, Peter & Ahlgren, Per, 2018. "Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics," Journal of Informetrics, Elsevier, vol. 12(1), pages 133-152.
  • Handle: RePEc:eee:infome:v:12:y:2018:i:1:p:133-152
    DOI: 10.1016/j.joi.2017.12.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157717303371
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Zhang, Lin & Liu, Xinhai & Janssens, Frizo & Liang, Liming & Glänzel, Wolfgang, 2010. "Subject clustering analysis based on ISI category classification," Journal of Informetrics, Elsevier, vol. 4(2), pages 185-193.
    2. Lawrence Hubert & Phipps Arabie, 1985. "Comparing partitions," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 193-218, December.
    3. Cristian Colliander, 2015. "A novel approach to citation normalization: A similarity-based method for creating reference sets," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(3), pages 489-500, March.
    4. Ismael Rafols & Alan L. Porter & Loet Leydesdorff, 2010. "Science overlay maps: A new tool for research policy and library management," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 61(9), pages 1871-1887, September.
    5. Peter van den Besselaar & Gaston Heimeriks, 2006. "Mapping research topics using word-reference co-occurrences: A method and an exploratory case study," Scientometrics, Springer;Akadémiai Kiadó, vol. 68(3), pages 377-393, September.
    6. Loet Leydesdorff & Lutz Bornmann & Caroline S. Wagner, 2017. "Generating clustered journal maps: an automated system for hierarchical classification," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(3), pages 1601-1614, March.
    7. Erjia Yan & Ying Ding & Elin K. Jacob, 2012. "Overlaying communities and topics: an analysis on publication networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 90(2), pages 499-513, February.
    8. Small, Henry & Boyack, Kevin W. & Klavans, Richard, 2014. "Identifying emerging topics in science and technology," Research Policy, Elsevier, vol. 43(8), pages 1450-1467.
    9. Yan, Erjia, 2014. "Research dynamics: Measuring the continuity and popularity of research topics," Journal of Informetrics, Elsevier, vol. 8(1), pages 98-110.
    10. Boyack, Kevin W. & Klavans, Richard, 2014. "Including cited non-source items in a large-scale map of science: What difference does it make?," Journal of Informetrics, Elsevier, vol. 8(3), pages 569-580.
    11. Douglas Henrique Milanez & Ed Noyons & Leandro Innocentini Lopes Faria, 2016. "A delineating procedure to retrieve relevant publication data in research areas: the case of nanocellulose," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(2), pages 627-643, May.
    12. Kevin W. Boyack & Richard Klavans & Katy Börner, 2005. "Mapping the backbone of science," Scientometrics, Springer;Akadémiai Kiadó, vol. 64(3), pages 351-374, August.
    13. Ludo Waltman & Nees Jan van Eck, 2013. "A smart local moving algorithm for large-scale modularity-based community detection," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 86(11), pages 1-14, November.
    14. Joachim Schummer, 2004. "Multidisciplinarity, interdisciplinarity, and patterns of research collaboration in nanoscience and nanotechnology," Scientometrics, Springer;Akadémiai Kiadó, vol. 59(3), pages 425-465, March.
    15. Richard Klavans & Kevin W. Boyack, 2011. "Using global mapping to create more accurate document-level maps of research fields," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 62(1), pages 1-18, January.
    16. Kevin W. Boyack & Richard Klavans, 2014. "Creation of a highly detailed, dynamic, global model and map of science," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 65(4), pages 670-685, April.
    17. Min Song & Go Eun Heo & Su Yeon Kim, 2014. "Analyzing topic evolution in bioinformatics: investigation of dynamics of the field with conference data in DBLP," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 397-428, October.
    18. Perianes-Rodriguez, Antonio & Ruiz-Castillo, Javier, 2017. "A comparison of the Web of Science and publication-level classification systems of science," Journal of Informetrics, Elsevier, vol. 11(1), pages 32-45.
    19. Kevin W. Boyack, 2017. "Investigating the effect of global data on topic detection," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 999-1015, May.
    20. Kevin W. Boyack & Richard Klavans, 2010. "Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately?," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 61(12), pages 2389-2404, December.
    21. S. Phineas Upham & Henry Small, 2010. "Emerging research fronts in science and technology: patterns of new knowledge development," Scientometrics, Springer;Akadémiai Kiadó, vol. 83(1), pages 15-38, April.
    22. Ludo Waltman & Nees Jan van Eck, 2012. "A new methodology for constructing a publication-level classification system of science," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(12), pages 2378-2392, December.
    23. Richard Klavans & Kevin W. Boyack, 2017. "Which Type of Citation Analysis Generates the Most Accurate Taxonomy of Scientific and Technical Knowledge?," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(4), pages 984-998, April.
    24. Wolfgang Glänzel & Bart Thijs, 2017. "Using hybrid methods and ‘core documents’ for the representation of clusters and topics: the astronomy dataset," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1071-1087, May.
    25. Ahlgren, Per & Colliander, Cristian, 2009. "Document–document similarity approaches and science mapping: Experimental comparison of five approaches," Journal of Informetrics, Elsevier, vol. 3(1), pages 49-63.
    26. Waltman, Ludo & van Eck, Nees Jan & van Leeuwen, Thed N. & Visser, Martijn S., 2013. "Some modifications to the SNIP journal impact indicator," Journal of Informetrics, Elsevier, vol. 7(2), pages 272-285.
    27. Bei Wen & Edwin Horlings & Mariëlle van der Zouwen & Peter van den Besselaar, 2017. "Mapping science through bibliometric triangulation: An experimental approach applied to water research," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(3), pages 724-738, March.
    28. W. Glänzel & A. Schubert & U. Schoepflin & H. J. Czerwon, 1999. "An item-by-item subject classification of papers published in journals covered by the SSCI database using reference analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 46(3), pages 431-441, November.
    29. W. Glänzel & A. Schubert & H.-J. Czerwon, 1999. "An item-by-item subject classification of papers published in multidisciplinary and general journals using reference analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 44(3), pages 427-439, March.
    30. Ludo Waltman & Nees Jan van Eck, 2013. "Source normalized indicators of citation impact: an overview of different approaches and an empirical comparison," Scientometrics, Springer;Akadémiai Kiadó, vol. 96(3), pages 699-716, September.
    31. Yan, Erjia & Ding, Ying & Milojević, Staša & Sugimoto, Cassidy R., 2012. "Topics in dynamic research communities: An exploratory study for the field of information retrieval," Journal of Informetrics, Elsevier, vol. 6(1), pages 140-153.
    32. Wolfgang Glänzel & András Schubert, 2003. "A new classification scheme of science fields and subfields designed for scientometric evaluation purposes," Scientometrics, Springer;Akadémiai Kiadó, vol. 56(3), pages 357-367, March.
    33. Xiaoguang Wang & Qikai Cheng & Wei Lu, 2014. "Analyzing evolution of research topics with NEViewer: a new method based on dynamic co-word networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1253-1271, November.
    34. Jochen Gläser & Wolfgang Glänzel & Andrea Scharnhorst, 2017. "Same data—different results? Towards a comparative approach to the identification of thematic structures in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 981-998, May.
    35. Theresa Velden & Kevin W. Boyack & Jochen Gläser & Rob Koopman & Andrea Scharnhorst & Shenghui Wang, 2017. "Comparison of topic extraction approaches and their results," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1169-1221, May.

    Citations



    Cited by:

    1. Haunschild, Robin & Schier, Hermann & Marx, Werner & Bornmann, Lutz, 2018. "Algorithmically generated subject categories based on citation relations: An empirical micro study using papers on overall water splitting," Journal of Informetrics, Elsevier, vol. 12(2), pages 436-447.
    2. Paul Donner, 2021. "Validation of the Astro dataset clustering solutions with external data," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1619-1645, February.
    3. Matthias Held & Grit Laudel & Jochen Gläser, 2021. "Challenges to the validity of topic reconstruction," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4511-4536, May.
    4. Haiko Lietz, 2020. "Drawing impossible boundaries: field delineation of Social Network Science," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2841-2876, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kevin W. Boyack, 2017. "Investigating the effect of global data on topic detection," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 999-1015, May.
    2. Carlos Olmeda-Gómez & Carlos Romá-Mateo & Maria-Antonia Ovalle-Perandones, 2019. "Overview of trends in global epigenetic research (2009–2017)," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1545-1574, June.
    3. Jielan Ding & Per Ahlgren & Liying Yang & Ting Yue, 2018. "Disciplinary structures in Nature, Science and PNAS: journal and country levels," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 1817-1852, September.
    4. Michel Zitt, 2015. "Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2223-2245, March.
    5. Matthias Held & Grit Laudel & Jochen Gläser, 2021. "Challenges to the validity of topic reconstruction," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4511-4536, May.
    6. Nees Jan van Eck & Ludo Waltman, 2017. "Citation-based clustering of publications using CitNetExplorer and VOSviewer," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1053-1070, May.
    7. Rons, Nadine, 2018. "Bibliometric approximation of a scientific specialty by combining key sources, title words, authors and references," Journal of Informetrics, Elsevier, vol. 12(1), pages 113-132.
    8. Shu, Fei & Julien, Charles-Antoine & Zhang, Lin & Qiu, Junping & Zhang, Jing & Larivière, Vincent, 2019. "Comparing journal and paper level classifications of science," Journal of Informetrics, Elsevier, vol. 13(1), pages 202-225.
    9. Fang Han & Christopher L. Magee, 2018. "Testing the science/technology relationship by analysis of patent citations of scientific papers after decomposition of both science and technology," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 767-796, August.
    10. Xu, Haiyun & Winnink, Jos & Yue, Zenghui & Zhang, Huiling & Pang, Hongshen, 2021. "Multidimensional Scientometric indicators for the detection of emerging research topics," Technological Forecasting and Social Change, Elsevier, vol. 163(C).
    11. Shuo Xu & Junwan Liu & Dongsheng Zhai & Xin An & Zheng Wang & Hongshen Pang, 2018. "Overlapping thematic structures extraction with mixed-membership stochastic blockmodel," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 61-84, October.
    12. Jochen Gläser & Wolfgang Glänzel & Andrea Scharnhorst, 2017. "Same data—different results? Towards a comparative approach to the identification of thematic structures in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 981-998, May.
    13. Yanto Chandra, 2018. "Mapping the evolution of entrepreneurship as a field of research (1990–2013): A scientometric analysis," PLOS ONE, Public Library of Science, vol. 13(1), pages 1-24, January.
    14. Shuo Xu & Liyuan Hao & Xin An & Hongshen Pang & Ting Li, 2020. "Review on emerging research topics with key-route main path analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(1), pages 607-624, January.
    15. Paul Donner, 2021. "Validation of the Astro dataset clustering solutions with external data," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1619-1645, February.
    16. Ying Huang & Wolfgang Glänzel & Lin Zhang, 2021. "Tracing the development of mapping knowledge domains," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 6201-6224, July.
    17. Carusi, Chiara & Bianchi, Giuseppe, 2019. "Scientific community detection via bipartite scholar/journal graph co-clustering," Journal of Informetrics, Elsevier, vol. 13(1), pages 354-386.
    18. Theresa Velden & Kevin W. Boyack & Jochen Gläser & Rob Koopman & Andrea Scharnhorst & Shenghui Wang, 2017. "Comparison of topic extraction approaches and their results," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1169-1221, May.
    19. R. Fileto Maciel & P. Saskia Bayerl & Marta Macedo Kerr Pinheiro, 2019. "Technical research innovations of the US national security system," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(2), pages 539-565, August.
    20. Hric, Darko & Kaski, Kimmo & Kivelä, Mikko, 2018. "Stochastic block model reveals maps of citation patterns and their evolution in time," Journal of Informetrics, Elsevier, vol. 12(3), pages 757-783.
