Harmonizing and Combining Large Datasets - An Application to Firm-Level Patent and Accounting Data

Authors

Grid Thoma
Salvatore Torrisi
Alfonso Gambardella
Dominique Guellec
Bronwyn H. Hall
Dietmar Harhoff

Abstract

This paper discusses methods for the harmonization and combination of large-scale patent and trademark datasets with each other and other sources of data. Dictionary- and rule-based approaches to the consolidation of applicant names in patent data are presented and shown to have both benefits and drawbacks in isolation. We combine the two methods and develop a set of rules and dictionaries to consolidate European, Patent Cooperation Treaty (PCT) and US patent data with firm accounting data. The resulting data encompass about 131,000 patent applicant names from 46 countries, covering 58.8 percent of EPO applications and 50.6 percent of PCT applications by business organizations during the time period from 1979 to 2008. For US data, the resulting dataset includes around 54,000 assignee names and 51.3 percent of US granted patents during approximately the same time period.
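
The abstract describes combining rule-based and dictionary-based consolidation of applicant names. As a rough illustration only, the sketch below shows one way such a combination could look: a rule step strips accents, punctuation and legal-form suffixes, and a dictionary step maps known variants to a canonical name. The legal-form list, the dictionary entries and the function names `normalize` and `harmonize` are hypothetical and are not taken from the paper.

```python
import re
import unicodedata

# Hypothetical legal-form tokens removed by the rule-based step (not the paper's list).
LEGAL_FORMS = {
    "INC", "CORP", "CORPORATION", "LTD", "LIMITED", "GMBH", "AG",
    "SA", "SPA", "BV", "NV", "PLC", "CO", "COMPANY", "LLC",
}

# Hypothetical dictionary mapping known name variants to one canonical name.
NAME_DICTIONARY = {
    "IBM": "INTERNATIONAL BUSINESS MACHINES",
    "INTL BUSINESS MACHINES": "INTERNATIONAL BUSINESS MACHINES",
}


def normalize(name: str) -> str:
    """Rule-based step: strip accents, punctuation and legal-form tokens."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Uppercase and delete periods so that "S.A." becomes "SA".
    text = text.upper().replace(".", "")
    # Replace any remaining non-alphanumeric run with a single space.
    text = re.sub(r"[^A-Z0-9]+", " ", text)
    tokens = [tok for tok in text.split() if tok not in LEGAL_FORMS]
    return " ".join(tokens)


def harmonize(name: str) -> str:
    """Dictionary step applied on top of the rule-based key."""
    key = normalize(name)
    return NAME_DICTIONARY.get(key, key)


if __name__ == "__main__":
    for raw in ["I.B.M. Corp.", "International Business Machines Corporation",
                "Siemens A.G.", "SIEMENS AG"]:
        print(raw, "->", harmonize(raw))
```

In this sketch the rules alone cannot tell that "IBM" and "International Business Machines" are the same firm, while the dictionary alone would miss spelling and punctuation variants; applying the dictionary to the rule-normalized key is one simple way to get the benefit of both.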

Suggested Citation

Grid Thoma & Salvatore Torrisi & Alfonso Gambardella & Dominique Guellec & Bronwyn H. Hall & Dietmar Harhoff, 2010. "Harmonizing and Combining Large Datasets - An Application to Firm-Level Patent and Accounting Data," NBER Working Papers 15851, National Bureau of Economic Research, Inc.

Handle: RePEc:nbr:nberwo:15851
Note: PR

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Nagaoka, Sadao & Motohashi, Kazuyuki & Goto, Akira, 2010. "Patent Statistics as an Innovation Indicator," Handbook of the Economics of Innovation, in: Bronwyn H. Hall & Nathan Rosenberg (ed.), Handbook of the Economics of Innovation, edition 1, volume 2, chapter 0, pages 1083-1127, Elsevier.
Boeing, Philipp & Mueller, Elisabeth, 2019. "Measuring China's patent quality: Development and validation of ISR indices," China Economic Review, Elsevier, vol. 57(C).
- Böing, Philipp & Müller, Elisabeth, 2019. "Measuring China's patent quality: Development and validation of ISR indices," ZEW Discussion Papers 19-017, ZEW - Leibniz Centre for European Economic Research.
Choi, Mincheol & Lee, Chang-Yang, 2021. "Technological diversification and R&D productivity: The moderating effects of knowledge spillovers and core-technology competence," Technovation, Elsevier, vol. 104(C).
Adam B. Jaffe & Gaétan de Rassenfosse, 2017. "Patent citation data in social science research: Overview and best practices," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(6), pages 1360-1374, June.
- Adam B. Jaffe & Gaétan de Rassenfosse, 2016. "Patent Citation Data in Social Science Research: Overview and Best Practices," NBER Working Papers 21868, National Bureau of Economic Research, Inc.
Fukugawa, Nobuya, 2012. "Impacts of intangible assets on the initial public offering of biotechnology startups," Economics Letters, Elsevier, vol. 116(1), pages 83-85.
Blanco, Iván & Wehrheim, David, 2017. "The bright side of financial derivatives: Options trading and firm innovation," Journal of Financial Economics, Elsevier, vol. 125(1), pages 99-119.
- Blanco, Iván & Wehrheim, David, 2016. "The Bright Side of Financial Derivatives: Options Trading and Firm Innovation," MPRA Paper 69239, University Library of Munich, Germany.
Antoine Dechezleprêtre & Yann Ménière & Myra Mohnen, 2017. "International patent families: from application strategies to statistical indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 793-828, May.
- Antoine Dechezleprêtre & Yann Ménière & Myra Mohnen, 2017. "International patent families: from application strategies to statistical indicators," GRI Working Papers 264, Grantham Research Institute on Climate Change and the Environment.
- Dechezleprêtre, Antoine & Ménière, Yann & Mohnen, Myra, 2017. "International patent families: from application strategies to statistical indicators," LSE Research Online Documents on Economics 69486, London School of Economics and Political Science, LSE Library.
- Antoine Dechezleprêtre & Yann Ménière & Myra Mohnen, 2017. "International patent families: from application strategies to statistical indicators," Post-Print hal-01693881, HAL.
Mohnen, Pierre, 2019. "R&D, innovation and productivity," MERIT Working Papers 2019-016, United Nations University - Maastricht Economic and Social Research Institute on Innovation and Technology (MERIT).
Stephane Lhuillery & Julio Raffo & Intan Hamdan-Livramento, 2016. "Measuring creativity: Learning from innovation measurement," WIPO Economic Research Working Papers 31, World Intellectual Property Organization - Economics and Statistics Division.
Alessandra Scandura, 2019. "The role of scientific and market knowledge in the inventive process: evidence from a survey of industrial inventors," The Journal of Technology Transfer, Springer, vol. 44(4), pages 1029-1069, August.
Dirk Czarnitzki & Katrin Hussinger & Bart Leten, 2020. "How Valuable are Patent Blocking Strategies?," Review of Industrial Organization, Springer;The Industrial Organization Society, vol. 56(3), pages 409-434, May.
Masatoshi Kato & Koichiro Onishi & Yuji Honjo, 2022. "Does patenting always help new firm survival? Understanding heterogeneity among exit routes," Small Business Economics, Springer, vol. 59(2), pages 449-475, August.
Gerald A. Carlino & Robert M. Hunt, 2009. "What explains the quantity and quality of local inventive activity?," Working Papers 09-12, Federal Reserve Bank of Philadelphia.
Higham, Kyle & de Rassenfosse, Gaétan & Jaffe, Adam B., 2021. "Patent Quality: Towards a Systematic Framework for Analysis and Measurement," Research Policy, Elsevier, vol. 50(4).
- Kyle W. Higham & Gaétan de Rassenfosse & Adam B. Jaffe, 2020. "Patent Quality: Towards a Systematic Framework for Analysis and Measurement," NBER Working Papers 27598, National Bureau of Economic Research, Inc.
- Higham, Kyle & de Rassenfosse, Gaetan & Jaffe, Adam B, 2020. "Patent Quality: Towards a Systematic Framework for Analysis and Measurement," SocArXiv 49qxk, Center for Open Science.
- Kyle Higham & Gaetan de Rassenfosse & Adam Jaffe, 2021. "Patent quality: Towards a Systematic Framework for Analysis and Measurement," Working Papers 14, Chair of Science, Technology, and Innovation Policy.
Chiara Pederzoli & Grid Thoma & Costanza Torricelli, 2011. "Modelling credit risk for innovative firms: the role of innovation measures," Centro Studi di Banca e Finanza (CEFIN) (Center for Studies in Banking and Finance) 0025, Universita di Modena e Reggio Emilia, Dipartimento di Economia "Marco Biagi".
Torrisi, Salvatore & Gambardella, Alfonso & Giuri, Paola & Harhoff, Dietmar & Hoisl, Karin & Mariani, Myriam, 2016. "Used, blocking and sleeping patents: Empirical evidence from a large-scale inventor survey," Research Policy, Elsevier, vol. 45(7), pages 1374-1385.
Nelson, Andrew J., 2009. "Measuring knowledge spillovers: What patents, licenses and publications reveal about innovation diffusion," Research Policy, Elsevier, vol. 38(6), pages 994-1005, July.
Deepak Somaya & Ian O. Williamson & Xiaomeng Zhang, 2007. "Combining Patent Law Expertise with R&D for Patenting Performance," Organization Science, INFORMS, vol. 18(6), pages 922-937, December.
He, Zi-Lin & Lim, Kwanghui & Wong, Poh-Kam, 2006. "Entry and competitive dynamics in the mobile telecommunications market," Research Policy, Elsevier, vol. 35(8), pages 1147-1165, October.
Bai, Qing & Tian, Shaonan, 2020. "Innovate or die: Corporate innovation and bankruptcy forecasts," Journal of Empirical Finance, Elsevier, vol. 59(C), pages 88-108.

More about this item

JEL classification:

C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
O34 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Intellectual Property and Intellectual Capital
