IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0284567.html

Identify novel elements of knowledge with word embedding

Author

Listed:
  • Deyun Yin
  • Zhao Wu
  • Kazuki Yokota
  • Kuniko Matsumoto
  • Sotaro Shibayama

Abstract

As novelty is a core value in science, a reliable approach to measuring the novelty of scientific documents is critical. Previous novelty measures, however, have a few limitations. First, the majority of previous measures are based on the concept of recombinant novelty, attempting to identify a novel combination of knowledge elements, but insufficient effort has been made to identify a novel element itself (element novelty). Second, most previous measures are not validated, and it is unclear what aspect of newness is measured. Third, some of the previous measures can be computed only in certain scientific fields because of technical constraints. This study thus aims to provide a validated and field-universal approach to computing element novelty. We drew on machine learning to develop a word embedding model, which allows us to extract semantic information from text data. Our validation analyses suggest that our word embedding model does convey semantic information. Based on the trained word embedding, we quantified the element novelty of a document by measuring its distance from the rest of the document universe. We then carried out a questionnaire survey to obtain self-reported novelty scores from 800 scientists. We found that our element novelty measure is significantly correlated with self-reported novelty in terms of discovering and identifying new phenomena, substances, molecules, etc., and that this correlation is observed across different scientific fields.
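
The distance-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how document vectors or distances are computed, so the sketch assumes a document vector is the average of its word vectors and that novelty is the mean cosine distance to every other document. The function names (`mean_vector`, `cosine_distance`, `element_novelty`) and the toy two-dimensional embedding are hypothetical and chosen only for illustration.

```python
import math


def mean_vector(tokens, embedding):
    """Average the vectors of the tokens found in the embedding (assumed pooling rule)."""
    vectors = [embedding[t] for t in tokens if t in embedding]
    if not vectors:
        raise ValueError("no token of the document is in the embedding vocabulary")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def element_novelty(documents, embedding):
    """Score each document by its mean cosine distance to all other documents.

    Assumption: "distance from the rest of the document universe" is taken
    here as a plain average over all other documents; the paper may differ.
    """
    if len(documents) < 2:
        raise ValueError("at least two documents are needed")
    doc_vectors = [mean_vector(doc, embedding) for doc in documents]
    scores = []
    for i, u in enumerate(doc_vectors):
        distances = [cosine_distance(u, v)
                     for j, v in enumerate(doc_vectors) if j != i]
        scores.append(sum(distances) / len(distances))
    return scores


# Made-up toy embedding and documents, for illustration only.
toy_embedding = {
    "protein": [1.0, 0.0],
    "enzyme": [0.9, 0.1],
    "graphene": [0.0, 1.0],
}
toy_documents = [
    ["protein", "enzyme"],
    ["enzyme", "protein", "protein"],
    ["graphene"],
]
scores = element_novelty(toy_documents, toy_embedding)
```

In this toy setting the third document, whose only word lies far from the others in the embedding space, receives the highest score. With real data, the embedding would come from a model trained on a large corpus, and the resulting scores could then be compared with self-reported novelty, for example through a rank correlation.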

Suggested Citation

  • Deyun Yin & Zhao Wu & Kazuki Yokota & Kuniko Matsumoto & Sotaro Shibayama, 2023. "Identify novel elements of knowledge with word embedding," PLOS ONE, Public Library of Science, vol. 18(6), pages 1-16, June.
  • Handle: RePEc:plo:pone00:0284567
    DOI: 10.1371/journal.pone.0284567

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0284567
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0284567&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0284567?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sotaro Shibayama & Deyun Yin & Kuniko Matsumoto, 2021. "Measuring novelty in science with word embedding," PLOS ONE, Public Library of Science, vol. 16(7), pages 1-16, July.
    2. Ron Boschma & Ernest Miguelez & Rosina Moreno & Diego B. Ocampo-Corrales, 2021. "Technological breakthroughs in European regions: the role of related and unrelated combinations," Papers in Evolutionary Economic Geography (PEEG) 2118, Utrecht University, Department of Human Geography and Spatial Planning, Group Economic Geography, revised Jun 2021.
    3. Quentin Plantec & Pascal Le Masson & Benoit Weil, 2020. "Impact of knowledge search practices on the originality of inventions: a study in the oil & gas industry," Post-Print hal-02613665, HAL.
    4. Michele Cincera & Ela Ince, 2019. "Types of Innovation and Firm performance," Working Papers TIMES² 2019-032, ULB -- Universite Libre de Bruxelles.
    5. Yi Zhao & Chengzhi Zhang, 2025. "A review on the novelty measurements of academic papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 130(2), pages 727-753, February.
    6. Plantec, Quentin & Le Masson, Pascal & Weil, Benoît, 2021. "Impact of knowledge search practices on the originality of inventions: A study in the oil & gas industry through dynamic patent analysis," Technological Forecasting and Social Change, Elsevier, vol. 168(C).
    7. Kuniko Matsumoto & Sotaro Shibayama & Byeongwoo Kang & Masatsura Igami, 2021. "Introducing a novelty indicator for scientific research: validating the knowledge-based combinatorial approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6891-6915, August.
    8. Verhoeven, Dennis & Bakker, Jurriën & Veugelers, Reinhilde, 2016. "Measuring technological novelty with patent-based indicators," Research Policy, Elsevier, vol. 45(3), pages 707-723.
    9. Dirk Fornahl & Nils Grashof & Alexander Kopka, 2021. "Do not neglect the periphery?! - the emergence and diffusion of radical innovations," Bremen Papers on Economics & Innovation 2102, University of Bremen, Faculty of Business Studies and Economics.
    10. Jan M. Gerken & Martin G. Moehrle, 2012. "A new instrument for technology monitoring: novelty in patents measured by semantic patent analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(3), pages 645-670, June.
    11. Kolja Hesse & Dirk Fornahl, 2020. "Essential ingredients for radical innovations? The role of (un‐)related variety and external linkages in Germany," Papers in Regional Science, Wiley Blackwell, vol. 99(5), pages 1165-1183, October.
    12. Dongqing Lyu & Kaile Gong & Xuanmin Ruan & Ying Cheng & Jiang Li, 2021. "Does research collaboration influence the “disruption” of articles? Evidence from neurosciences," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(1), pages 287-303, January.
    13. Ugo Rizzo & Nicolò Barbieri & Laura Ramaciotti & Demian Iannantuono, 2020. "The division of labour between academia and industry for the generation of radical inventions," The Journal of Technology Transfer, Springer, vol. 45(2), pages 393-413, April.
    14. Yuandi Wang & Xiongfeng Pan & Yantai Chen & Xin Gu, 2013. "Do references in transferred patent documents signal learning opportunities for the receiving firms?," Scientometrics, Springer;Akadémiai Kiadó, vol. 95(2), pages 731-752, May.
    15. Barbieri, Nicolò & Marzucchi, Alberto & Rizzo, Ugo, 2020. "Knowledge sources and impacts on subsequent inventions: Do green technologies differ from non-green ones?," Research Policy, Elsevier, vol. 49(2).
    16. Sarah Kaplan & Keyvan Vakili, 2015. "The double-edged sword of recombination in breakthrough innovation," Strategic Management Journal, Wiley Blackwell, vol. 36(10), pages 1435-1457, October.
    17. Pierre Pelletier & Kevin Wirtz, 2023. "Sails and Anchors: The Complementarity of Exploratory and Exploitative Scientists in Knowledge Creation," Papers 2312.10476, arXiv.org.
    18. Sandro Montresor & Gianluca Orsatti & Francesco Quatraro, 2023. "Technological novelty and key enabling technologies: evidence from European regions," Economics of Innovation and New Technology, Taylor & Francis Journals, vol. 32(6), pages 851-872, August.
    19. Plantec, Quentin & Deval, Marie-Alix & Hooge, Sophie & Weil, Benoit, 2023. "Big data as an exploration trigger or problem-solving patch: Design and integration of AI-embedded systems in the automotive industry," Technovation, Elsevier, vol. 124(C).
    20. Kathryn Rudie Harrigan & Maria Chiara DiGuardo, 2017. "Sustainability of patent-based competitive advantage in the U.S. communications services industry," The Journal of Technology Transfer, Springer, vol. 42(6), pages 1334-1361, December.

