Printed from https://ideas.repec.org/a/plo/pone00/0176310.html

Classifying patents based on their semantic content

Author

Listed:
  • Antonin Bergeaud
  • Yoann Potiron
  • Juste Raimbault

Abstract

In this paper, we extend some usual classification techniques that result from a large-scale data-mining and network approach. This new technology, designed in particular to be suitable for big data, is used to construct an open consolidated database from raw data on 4 million patents taken from the US Patent Office from 1976 onward. To build the pattern network, we look not only at each patent's title but also at its full abstract, and extract the relevant keywords accordingly. We refer to this classification as the semantic approach, in contrast with the more common technological approach, which consists in taking the topology obtained when considering US Patent Office technological classes. Moreover, we document that the two approaches have highly different topological measures, and we find strong statistical evidence that they feature a different model. This suggests that our method is a useful tool to extract endogenous information.
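
The contrast between the two approaches can be illustrated with a small sketch. It is not the authors' pipeline: the keyword extractor is a crude frequency count standing in for whatever extraction the paper uses, and the three patents, their texts, the class labels `X` and `Y`, and all function names are hypothetical. It only shows how a network built from shared keywords and a network built from shared classes can link different pairs of patents.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "with", "is", "by"}


def extract_keywords(text, top_k=5):
    """Return up to top_k frequent non-stopword tokens.

    A crude stand-in for real keyword extraction.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word for word, _ in ranked[:top_k]}


def shared_item_network(items_by_patent):
    """Link two patents when their item sets overlap; weight = overlap size."""
    edges = {}
    for a, b in combinations(sorted(items_by_patent), 2):
        shared = items_by_patent[a] & items_by_patent[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


# Hypothetical toy data: patent id -> (title, abstract).
patents = {
    "P1": ("Solar cell", "A photovoltaic solar cell with a silicon layer."),
    "P2": ("Solar panel mount", "A mount for a photovoltaic solar panel."),
    "P3": ("Bicycle brake", "A brake lever for a bicycle wheel."),
}
# Hypothetical technological classes: patent id -> set of class labels.
classes = {"P1": {"X"}, "P2": {"Y"}, "P3": {"Y"}}

# Semantic network: keywords come from the title and the full abstract.
keywords = {
    pid: extract_keywords(title + " " + abstract)
    for pid, (title, abstract) in patents.items()
}
semantic_edges = shared_item_network(keywords)

# Technological network: links come from shared class labels.
class_edges = shared_item_network(classes)

print("semantic:", semantic_edges)
print("technological:", class_edges)
```

In this toy example the semantic network links P1 and P2 through their shared vocabulary, while the class-based network links P2 and P3, so the two topologies differ even on three patents.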

Suggested Citation

  • Antonin Bergeaud & Yoann Potiron & Juste Raimbault, 2017. "Classifying patents based on their semantic content," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-22, April.
  • Handle: RePEc:plo:pone00:0176310
    DOI: 10.1371/journal.pone.0176310

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0176310
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0176310&type=printable
    Download Restriction: no


    Citations

    Cited by:

    1. Ananthan Nambiar & Tobias Rubel & James McCaull & Jon deVries & Mark Bedau, 2021. "Dropping diversity of products of large US firms: Models and measures," Papers 2110.08367, arXiv.org.
    2. Juste Raimbault, 2019. "Exploration of an interdisciplinary scientific landscape," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(2), pages 617-641, May.
    3. Philippe Aghion & Antonin Bergeaud & John Van Reenen, 2023. "The Impact of Regulation on Innovation," American Economic Review, American Economic Association, vol. 113(11), pages 2894-2936, November.
    4. David Lenz & Peter Winker, 2020. "Measuring the diffusion of innovations with paragraph vector topic models," PLOS ONE, Public Library of Science, vol. 15(1), pages 1-18, January.
    5. A. Fronzetti Colladon & B. Guardabascio & F. Venturini, 2023. "A new mapping of technological interdependence," Papers 2308.00014, arXiv.org, revised Sep 2024.
    6. Sarah Oh, 2020. "Radio “Fences” and Inventor Attention to Property Rights: Evidence from Wireless Patents," Review of Industrial Organization, Springer;The Industrial Organization Society, vol. 56(1), pages 37-72, February.
    7. Gątkowski, Mateusz & Dietl, Marek & Skrok, Lukasz & Whalen, Ryan & Rockett, Katharine, 2018. "Patent Thickets Identification," Economics Discussion Papers 22928, University of Essex, Department of Economics.
    8. Arts, Sam & Hou, Jianan & Gomez, Juan Carlos, 2021. "Natural language processing to identify the creation and impact of new technologies in patent text: Code, data, and new measures," Research Policy, Elsevier, vol. 50(2).
    9. Antoine Peris & Evert Meijers & Maarten Ham, 2018. "The Evolution of the Systems of Cities Literature Since 1995: Schools of Thought and their Interaction," Networks and Spatial Economics, Springer, vol. 18(3), pages 533-554, September.
    10. Jonathan H. Ashtor, 2019. "Investigating Cohort Similarity as an Ex Ante Alternative to Patent Forward Citations," Journal of Empirical Legal Studies, John Wiley & Sons, vol. 16(4), pages 848-880, December.
    11. Sijie Feng, 2020. "The proximity of ideas: An analysis of patent text using machine learning," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-19, July.
    12. Jeffrey P. Clemens & Parker Rogers, 2020. "Demand Shocks, Procurement Policies, and the Nature of Medical Innovation: Evidence from Wartime Prosthetic Device Patents," CESifo Working Paper Series 8781, CESifo.
    13. Gątkowski, Mateusz & Dietl, Marek & Skrok, Łukasz & Whalen, Ryan & Rockett, Katharine, 2020. "Semantically-based patent thicket identification," Research Policy, Elsevier, vol. 49(2).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alex Bell & Raj Chetty & Xavier Jaravel & Neviana Petkova & John Van Reenen, 2019. "Who Becomes an Inventor in America? The Importance of Exposure to Innovation," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 134(2), pages 647-713.
    2. Bergeaud Antonin & Schmidt Julia & Zago Riccardo, 2022. "Patents that Match your Standards: Firm-level Evidence on Competition and Growth," Working papers 876, Banque de France.
    3. Daron Acemoglu & Ufuk Akcigit & William R. Kerr, 2016. "Innovation Network," CESifo Working Paper Series 6173, CESifo.
    4. Felix Bracht & Dennis Verhoeven, 2021. "Air pollution and innovation," CEP Discussion Papers dp1817, Centre for Economic Performance, LSE.
    5. Choi, Mincheol & Lee, Chang-Yang, 2021. "Technological diversification and R&D productivity: The moderating effects of knowledge spillovers and core-technology competence," Technovation, Elsevier, vol. 104(C).
    6. Nathan Goldschlag & Elisabeth Perlman, 2017. "Business Dynamic Statistics of Innovative Firms," Working Papers 17-72, Center for Economic Studies, U.S. Census Bureau.
    7. Ufuk Akcigit & William R. Kerr, 2018. "Growth through Heterogeneous Innovations," Journal of Political Economy, University of Chicago Press, vol. 126(4), pages 1374-1443.
    8. Viral V. Acharya & Ramin P. Baghai & Krishnamurthy V. Subramanian, 2013. "Labor Laws and Innovation," Journal of Law and Economics, University of Chicago Press, vol. 56(4), pages 997-1037.
    9. Antonin Bergeaud & Julia Schmidt & Riccardo Zago, 2022. "Patents that match your standards: firm-level evidence on competition and innovation," CEP Discussion Papers dp1881, Centre for Economic Performance, LSE.
    10. Neves, Pedro Cunha & Sequeira, Tiago Neves, 2018. "Spillovers in the production of knowledge: A meta-regression analysis," Research Policy, Elsevier, vol. 47(4), pages 750-767.
    11. Hasan, Iftekhar & Tucci, Christopher L., 2010. "The innovation-economic growth nexus: Global evidence," Research Policy, Elsevier, vol. 39(10), pages 1264-1276, December.
    12. Minniti, Antonio & Venturini, Francesco, 2017. "The long-run growth effects of R&D policy," Research Policy, Elsevier, vol. 46(1), pages 316-326.
    13. Krammer, Sorin M.S., 2009. "Drivers of national innovation in transition: Evidence from a panel of Eastern European countries," Research Policy, Elsevier, vol. 38(5), pages 845-860, June.
    14. A. Fronzetti Colladon & B. Guardabascio & F. Venturini, 2023. "A new mapping of technological interdependence," Papers 2308.00014, arXiv.org, revised Sep 2024.
    15. Petra Moser & Joerg Ohmstedt & Paul W. Rhode, 2018. "Patent Citations—An Analysis of Quality Differences and Citing Practices in Hybrid Corn," Management Science, INFORMS, vol. 64(4), pages 1926-1940, April.
    16. Arts, Sam & Hou, Jianan & Gomez, Juan Carlos, 2021. "Natural language processing to identify the creation and impact of new technologies in patent text: Code, data, and new measures," Research Policy, Elsevier, vol. 50(2).
    17. Galasso, Alberto & Schankerman, Mark, 2013. "Patents and Cumulative Innovation:Causal Evidence from the Courts," IIR Working Paper 13-16, Institute of Innovation Research, Hitotsubashi University.
    18. Pauly, Stefan & Stipanicic, Fernando, 2021. "The creation and diffusion of knowledge: Evidence from the Jet Age," CEPREMAP Working Papers (Docweb) 2112, CEPREMAP.
    19. Loet Leydesdorff & Dieter Franz Kogler & Bowen Yan, 2017. "Mapping patent classifications: portfolio and statistical analysis, and the comparison of strengths and weaknesses," Scientometrics, Springer;Akadémiai Kiadó, vol. 112(3), pages 1573-1591, September.
    20. Stephen G. Dimmock & Jiekun Huang & Scott J. Weisbenner, 2022. "Give Me Your Tired, Your Poor, Your High-Skilled Labor: H-1B Lottery Outcomes and Entrepreneurial Success," Management Science, INFORMS, vol. 68(9), pages 6950-6970, September.

    More about this item

    JEL classification:

    • O3 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights
    • O39 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Other
