Printed from https://ideas.repec.org/a/spr/scient/v116y2018i2d10.1007_s11192-018-2783-x.html

Funding map using paragraph embedding based on semantic diversity

Authors

  • Takahiro Kawamura (Japan Science and Technology Agency)
  • Katsutaro Watanabe (Japan Science and Technology Agency)
  • Naoya Matsumoto (Japan Science and Technology Agency)
  • Shusaku Egami (Japan Science and Technology Agency)
  • Mari Jibu (Japan Science and Technology Agency)

Abstract

Maps of science that represent the structure of science can help us understand science and technology (S&T) development. Studies have therefore developed techniques for analyzing the relationships among research activities; however, inter-citation and co-citation analysis are difficult to apply to ongoing research projects and recently published papers. To characterize what is currently being attempted in the scientific landscape, this paper proposes a new content-based method of locating research projects in a multi-dimensional space using recent word/paragraph embedding techniques. Specifically, to address the problem that the original paragraph vectors do not form clusters, we introduce paragraph vectors based on the information entropies of concepts in an S&T thesaurus. The experimental results show that the proposed method successfully formed a clustered map from 25,607 project descriptions of the EU's 7th Framework Programme from 2006 to 2016 and 34,192 project descriptions of the National Science Foundation from 2012 to 2016.
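
The abstract does not spell out how the information entropy of a thesaurus concept is computed, so the following is only an illustrative sketch of one common reading: the Shannon entropy of how a concept's occurrences are spread across project descriptions. The function name `concept_entropy`, the input layout, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def concept_entropy(doc_concepts, concept):
    """Shannon entropy (in bits) of a concept's spread over documents.

    doc_concepts: list of documents, each a list of concept labels
                  (hypothetical input format, not the paper's).
    Returns 0.0 when the concept occurs in at most one document.
    """
    counts = [Counter(doc)[concept] for doc in doc_concepts]
    total = sum(counts)
    if total == 0:
        return 0.0
    # H = sum_d p_d * log2(1 / p_d), where p_d is the share of the
    # concept's occurrences that fall in document d.
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)


# Toy data: "cell" appears once in each of four documents, "laser" in only one.
docs = [["cell", "laser"], ["cell"], ["cell"], ["cell"]]
print(concept_entropy(docs, "cell"))
print(concept_entropy(docs, "laser"))
```

Under this reading, a concept spread evenly over many descriptions has high entropy, while one confined to a single description has zero entropy; how such values enter the paragraph vectors is described in the full paper, not in this record.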

Suggested Citation

  • Takahiro Kawamura & Katsutaro Watanabe & Naoya Matsumoto & Shusaku Egami & Mari Jibu, 2018. "Funding map using paragraph embedding based on semantic diversity," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 941-958, August.
  • Handle: RePEc:spr:scient:v:116:y:2018:i:2:d:10.1007_s11192-018-2783-x
    DOI: 10.1007/s11192-018-2783-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-018-2783-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Richard Klavans & Kevin W. Boyack, 2017. "Which Type of Citation Analysis Generates the Most Accurate Taxonomy of Scientific and Technical Knowledge?," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(4), pages 984-998, April.
    2. Rob Koopman & Shenghui Wang & Andrea Scharnhorst, 2017. "Contextualization of topics: browsing through the universe of bibliographic information," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1119-1139, May.
    3. Klavans, Richard & Boyack, Kevin W., 2017. "Research portfolio analysis and topic prominence," Journal of Informetrics, Elsevier, vol. 11(4), pages 1158-1174.
    4. Kevin W. Boyack & Henry Small & Richard Klavans, 2013. "Improving the accuracy of co-citation clustering using full text," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 64(9), pages 1759-1767, September.
    5. Kevin W. Boyack & Richard Klavans & Katy Börner, 2005. "Mapping the backbone of science," Scientometrics, Springer;Akadémiai Kiadó, vol. 64(3), pages 351-374, August.
    6. Shenghui Wang & Rob Koopman, 2017. "Clustering articles based on semantic similarity," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1017-1031, May.
    7. Ahlgren, Per & Colliander, Cristian, 2009. "Document–document similarity approaches and science mapping: Experimental comparison of five approaches," Journal of Informetrics, Elsevier, vol. 3(1), pages 49-63.

    Citations

    Cited by:

    1. Kamal Sanguri & Atanu Bhuyan & Sabyasachi Patra, 2020. "A semantic similarity adjusted document co-citation analysis: a case of tourism supply chain," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(1), pages 233-269, October.
    2. Yi Zhang & Fen Zhao & Jianguo Lu, 2019. "P2V: large-scale academic paper embedding," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(1), pages 399-432, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Paul Donner, 2021. "Validation of the Astro dataset clustering solutions with external data," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1619-1645, February.
    2. Shenghui Wang & Rob Koopman, 2017. "Clustering articles based on semantic similarity," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1017-1031, May.
    3. Sjögårde, Peter & Ahlgren, Per, 2018. "Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics," Journal of Informetrics, Elsevier, vol. 12(1), pages 133-152.
    4. Jochen Gläser & Wolfgang Glänzel & Andrea Scharnhorst, 2017. "Same data—different results? Towards a comparative approach to the identification of thematic structures in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 981-998, May.
    5. Ludo Waltman & Nees Jan van Eck, 2012. "A new methodology for constructing a publication-level classification system of science," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(12), pages 2378-2392, December.
    6. Michel Zitt, 2015. "Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2223-2245, March.
    7. Carusi, Chiara & Bianchi, Giuseppe, 2019. "Scientific community detection via bipartite scholar/journal graph co-clustering," Journal of Informetrics, Elsevier, vol. 13(1), pages 354-386.
    8. Matthias Held & Grit Laudel & Jochen Gläser, 2021. "Challenges to the validity of topic reconstruction," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4511-4536, May.
    9. Rob Koopman & Shenghui Wang, 2017. "Mutual information based labelling and comparing clusters," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1157-1167, May.
    10. Theresa Velden & Kevin W. Boyack & Jochen Gläser & Rob Koopman & Andrea Scharnhorst & Shenghui Wang, 2017. "Comparison of topic extraction approaches and their results," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1169-1221, May.
    11. Cristian Colliander & Per Ahlgren, 2012. "Experimental comparison of first and second-order similarities in a scientometric context," Scientometrics, Springer;Akadémiai Kiadó, vol. 90(2), pages 675-685, February.
    12. Bart Thijs, 2020. "Using neural-network based paragraph embeddings for the calculation of within and between document similarities," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 835-849, November.
    13. Fabian Meyer-Brötz & Edgar Schiebel & Leo Brecht, 2017. "Experimental evaluation of parameter settings in calculation of hybrid similarities: effects of first- and second-order similarity, edge cutting, and weighting factors," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(3), pages 1307-1325, June.
    14. Hric, Darko & Kaski, Kimmo & Kivelä, Mikko, 2018. "Stochastic block model reveals maps of citation patterns and their evolution in time," Journal of Informetrics, Elsevier, vol. 12(3), pages 757-783.
    15. Keisuke Okamura, 2019. "Interdisciplinarity revisited: evidence for research impact and dynamism," Palgrave Communications, Palgrave Macmillan, vol. 5(1), pages 1-9, December.
    16. Sitaram Devarakonda & Dmitriy Korobskiy & Tandy Warnow & George Chacko, 2020. "Viewing computer science through citation analysis: Salton and Bergmark Redux," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(1), pages 271-287, October.
    17. Rongrong Li & Xuefeng Wang, 2019. "Imbalances between the Quantity and Quality of China’s Solar Energy Research," Sustainability, MDPI, vol. 11(3), pages 1-15, January.
    18. Shu, Fei & Julien, Charles-Antoine & Zhang, Lin & Qiu, Junping & Zhang, Jing & Larivière, Vincent, 2019. "Comparing journal and paper level classifications of science," Journal of Informetrics, Elsevier, vol. 13(1), pages 202-225.
    19. Jing Zhang & Xiaomin Liu & Lili Wu, 2016. "The study of subject-classification based on journal coupling and expert subject-classification system," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1149-1170, June.
    20. Gómez-Núñez, Antonio J. & Batagelj, Vladimir & Vargas-Quesada, Benjamín & Moya-Anegón, Félix & Chinchilla-Rodríguez, Zaida, 2014. "Optimizing SCImago Journal & Country Rank classification by community detection," Journal of Informetrics, Elsevier, vol. 8(2), pages 369-383.
