Harmonizing and Combining Large Datasets - An Application to Firm-Level Patent and Accounting Data
This paper discusses methods for harmonizing large-scale patent and trademark datasets and combining them with each other and with other sources of data. Dictionary-based and rule-based approaches to consolidating applicant names in patent data are presented, and each is shown to have both benefits and drawbacks when used in isolation. We combine the two methods and develop a set of rules and dictionaries to consolidate European Patent Office (EPO), Patent Cooperation Treaty (PCT) and US patent data with firm accounting data. The resulting data encompass about 131,000 patent applicant names from 46 countries, covering 58.8 percent of EPO applications and 50.6 percent of PCT applications by business organizations from 1979 to 2008. For US data, the resulting dataset includes around 54,000 assignee names and covers 51.3 percent of US granted patents over approximately the same period.
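To illustrate how a rule-based step and a dictionary-based step can be combined when consolidating applicant names, here is a minimal Python sketch. It is not the authors' implementation. The legal-form list, the variant dictionary, and the function names `apply_rules`, `consolidate`, and `link` are all hypothetical and chosen only for illustration. A real rule set and dictionary would be much larger and country-specific.

```python
import re
import unicodedata

# Hypothetical legal-form tokens (illustrative only, not the paper's list).
LEGAL_FORMS = {
    "INC", "CORP", "CORPORATION", "LTD", "LIMITED",
    "GMBH", "AG", "SA", "CO", "COMPANY", "PLC", "LLC",
}

# Hypothetical dictionary mapping known cleaned variants to one canonical name.
VARIANT_DICTIONARY = {
    "IBM": "INTERNATIONAL BUSINESS MACHINES",
    "INTL BUSINESS MACHINES": "INTERNATIONAL BUSINESS MACHINES",
}


def apply_rules(name: str) -> str:
    """Rule-based step: normalize spelling and strip trailing legal forms."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    upper = plain.upper().replace("&", " AND ")
    # Drop periods so dotted abbreviations collapse ("I.B.M." -> "IBM").
    upper = upper.replace(".", "")
    # Treat every other run of non-alphanumeric characters as a separator.
    tokens = re.sub(r"[^A-Z0-9]+", " ", upper).split()
    # Remove legal-form tokens from the end, but never empty the name.
    while len(tokens) > 1 and tokens[-1] in LEGAL_FORMS:
        tokens.pop()
    return " ".join(tokens)


def consolidate(name: str) -> str:
    """Combined step: apply the rules, then look up the dictionary."""
    cleaned = apply_rules(name)
    return VARIANT_DICTIONARY.get(cleaned, cleaned)


def link(patent_names, firm_names):
    """Match each patent applicant name to a firm name via the shared key."""
    firms_by_key = {consolidate(firm): firm for firm in firm_names}
    return {name: firms_by_key.get(consolidate(name)) for name in patent_names}


if __name__ == "__main__":
    applicants = ["I.B.M. Corp.", "Intl. Business Machines, Inc.", "Unknown Labs Ltd"]
    firms = ["International Business Machines Corporation"]
    for applicant, firm in link(applicants, firms).items():
        print(f"{applicant!r} -> {firm!r}")
```

In this sketch the rules alone cannot tell that "IBM" and "International Business Machines" are the same firm, and the dictionary alone would need a separate entry for every punctuation and legal-form variant. Running the rules first keeps the dictionary small, which is one way to read the abstract's point about combining the two approaches.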
Date of creation: Mar 2010
Contact details of provider: Postal: National Bureau of Economic Research, 1050 Massachusetts Avenue, Cambridge, MA 02138, U.S.A.
Web page: http://www.nber.org
Handle: RePEc:nbr:nberwo:15851