
Harmonizing and Combining Large Datasets - An Application to Firm-Level Patent and Accounting Data

Authors

  • Grid Thoma
  • Salvatore Torrisi
  • Alfonso Gambardella
  • Dominique Guellec
  • Bronwyn H. Hall
  • Dietmar Harhoff

Abstract

This paper discusses methods for the harmonization and combination of large-scale patent and trademark datasets with each other and other sources of data. Dictionary- and rule-based approaches to the consolidation of applicant names in patent data are presented and shown to have both benefits and drawbacks in isolation. We combine the two methods and develop a set of rules and dictionaries to consolidate European, Patent Cooperation Treaty (PCT) and US patent data with firm accounting data. The resulting data encompass about 131,000 patent applicant names from 46 countries, covering 58.8 percent of EPO applications and 50.6 percent of PCT applications by business organizations during the time period from 1979 to 2008. For US data, the resulting dataset includes around 54,000 assignee names and 51.3 percent of US granted patents during approximately the same time period.
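
To illustrate how rule-based and dictionary-based consolidation can complement each other, here is a minimal sketch in Python. It is not the authors' implementation: the legal-form list, the dictionary entries, the firm names, and the function names `apply_rules` and `harmonize` are all hypothetical and chosen only for illustration. The rule step cleans each raw name; the dictionary step then maps known cleaned variants to a single consolidated name.

```python
import re

# Hypothetical legal-form tokens to strip; a real rule set would be far larger
# and country-specific.
LEGAL_FORMS = {"INC", "CORP", "CORPORATION", "CO", "LTD", "LIMITED",
               "GMBH", "AG", "SA", "SPA", "BV", "NV", "PLC", "LLC"}

# Hypothetical dictionary: cleaned name variant -> consolidated name.
NAME_DICTIONARY = {
    "ACME WIDGET": "ACME WIDGETS",
    "AW": "ACME WIDGETS",
}


def apply_rules(raw_name: str) -> str:
    """Rule-based step: uppercase, drop punctuation, strip trailing legal forms."""
    name = raw_name.upper()
    name = re.sub(r"[.']", "", name)          # "A.W." -> "AW", "CO." -> "CO"
    name = re.sub(r"[^A-Z0-9 ]", " ", name)   # other punctuation becomes a space
    tokens = name.split()
    # Remove legal-form suffixes, but never strip the name down to nothing.
    while len(tokens) > 1 and tokens[-1] in LEGAL_FORMS:
        tokens.pop()
    return " ".join(tokens)


def harmonize(raw_name: str) -> str:
    """Combined step: apply the rules first, then look up the dictionary."""
    cleaned = apply_rules(raw_name)
    return NAME_DICTIONARY.get(cleaned, cleaned)


if __name__ == "__main__":
    variants = ["Acme Widgets, Inc.", "ACME WIDGET CO. LTD.", "A.W. GmbH"]
    for v in variants:
        print(f"{v!r} -> {harmonize(v)!r}")
```

In this sketch the rules alone leave three distinct strings ("ACME WIDGETS", "ACME WIDGET", "AW"), while the dictionary alone would miss variants that differ only in punctuation or legal form; chaining the two collapses all three to one name. The sketch handles ASCII letters only, so accented characters would need separate treatment.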

Suggested Citation

  • Grid Thoma & Salvatore Torrisi & Alfonso Gambardella & Dominique Guellec & Bronwyn H. Hall & Dietmar Harhoff, 2010. "Harmonizing and Combining Large Datasets - An Application to Firm-Level Patent and Accounting Data," NBER Working Papers 15851, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:15851
    Note: PR

    Download full text from publisher

    File URL: http://www.nber.org/papers/w15851.pdf
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nagaoka, Sadao & Motohashi, Kazuyuki & Goto, Akira, 2010. "Patent Statistics as an Innovation Indicator," Handbook of the Economics of Innovation, in: Bronwyn H. Hall & Nathan Rosenberg (ed.), Handbook of the Economics of Innovation, edition 1, volume 2, chapter 0, pages 1083-1127, Elsevier.
    2. Adam B. Jaffe & Gaétan de Rassenfosse, 2017. "Patent citation data in social science research: Overview and best practices," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(6), pages 1360-1374, June.
    3. Antoine Dechezleprêtre & Yann Ménière & Myra Mohnen, 2017. "International patent families: from application strategies to statistical indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 793-828, May.
    4. Fukugawa, Nobuya, 2012. "Impacts of intangible assets on the initial public offering of biotechnology startups," Economics Letters, Elsevier, vol. 116(1), pages 83-85.
    5. Boeing, Philipp & Mueller, Elisabeth, 2019. "Measuring China's patent quality: Development and validation of ISR indices," China Economic Review, Elsevier, vol. 57(C).
    6. Blanco, Iván & Wehrheim, David, 2017. "The bright side of financial derivatives: Options trading and firm innovation," Journal of Financial Economics, Elsevier, vol. 125(1), pages 99-119.
    7. Stéphane Lhuillery & Julio Raffo & Intan Hamdan-Livramento, 2016. "Measuring creativity: Learning from innovation measurement," WIPO Economic Research Working Papers 31, World Intellectual Property Organization - Economics and Statistics Division.
    8. Choi, Mincheol & Lee, Chang-Yang, 2021. "Technological diversification and R&D productivity: The moderating effects of knowledge spillovers and core-technology competence," Technovation, Elsevier, vol. 104(C).
    9. Alessandra Scandura, 2019. "The role of scientific and market knowledge in the inventive process: evidence from a survey of industrial inventors," The Journal of Technology Transfer, Springer, vol. 44(4), pages 1029-1069, August.
    10. Dirk Czarnitzki & Katrin Hussinger & Bart Leten, 2020. "How Valuable are Patent Blocking Strategies?," Review of Industrial Organization, Springer;The Industrial Organization Society, vol. 56(3), pages 409-434, May.
    11. Felix Bracht & Dennis Verhoeven, 2021. "Air pollution and innovation," CEP Discussion Papers dp1817, Centre for Economic Performance, LSE.
    12. Masatoshi Kato & Koichiro Onishi & Yuji Honjo, 2022. "Does patenting always help new firm survival? Understanding heterogeneity among exit routes," Small Business Economics, Springer, vol. 59(2), pages 449-475, August.
    13. Bronwyn Hall & Christian Helmers & Mark Rogers & Vania Sena, 2014. "The Choice between Formal and Informal Intellectual Property: A Review," Journal of Economic Literature, American Economic Association, vol. 52(2), pages 375-423, June.
    14. Nelson, Andrew J., 2009. "Measuring knowledge spillovers: What patents, licenses and publications reveal about innovation diffusion," Research Policy, Elsevier, vol. 38(6), pages 994-1005, July.
    15. Gao, Wenlian & Chou, Julia, 2015. "Innovation efficiency, global diversification, and firm value," Journal of Corporate Finance, Elsevier, vol. 30(C), pages 278-298.
    16. Albino, Vito & Ardito, Lorenzo & Dangelico, Rosa Maria & Messeni Petruzzelli, Antonio, 2014. "Understanding the development trends of low-carbon energy technologies: A patent analysis," Applied Energy, Elsevier, vol. 135(C), pages 836-854.
    17. Petra Moser & Joerg Ohmstedt & Paul W. Rhode, 2015. "Patent Citations and the Size of the Inventive Step - Evidence from Hybrid Corn," NBER Working Papers 21443, National Bureau of Economic Research, Inc.
    18. Huang, Kenneth Guang-Lih & Huang, Can & Shen, Huijun & Mao, Hao, 2021. "Assessing the value of China's patented inventions," Technological Forecasting and Social Change, Elsevier, vol. 170(C).
    19. Ryan L. Lampe & Petra Moser, 2012. "Do Patent Pools Encourage Innovation? Evidence from 20 U.S. Industries under the New Deal," NBER Working Papers 18316, National Bureau of Economic Research, Inc.
    20. Bertoni, Fabio & Tykvová, Tereza, 2015. "Does governmental venture capital spur invention and innovation? Evidence from young European biotech companies," Research Policy, Elsevier, vol. 44(4), pages 925-935.

    More about this item

    JEL classification:

    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • O34 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Intellectual Property and Intellectual Capital
