
Singling out individual inventors from patent data

  • Ernest Miguélez
    (AQR-IREA. Department of Econometrics, Statistics and Spanish Economy. University of Barcelona, Av. Diagonal 690, 08034 Barcelona, Spain)

  • Ismael Gómez-Miguélez
    (Signal Theory and Communications Department. Technical University of Catalonia, c/ Jordi Girona 1-3, 08034 Barcelona, Spain)

A growing number of studies in recent years have sought to identify individual inventors from patent data. Different heuristics have been suggested that use inventors' names and other information disclosed in patent documents to find out “who is who” in patents. This paper contributes to this literature by setting forth a methodology to identify inventors using patent applications filed with the European Patent Office (EPO hereafter). As in much of this literature, we follow a three-step procedure: (1) the parsing stage, aimed at reducing the noise in the inventor’s name and other fields of the patent; (2) the matching stage, where name-matching algorithms are used to group possibly similar names; (3) the filtering stage, where additional information and different scoring schemes are used to filter these candidate matches and decide which refer to the same inventor. The paper includes some figures resulting from applying the algorithms to the set of European inventors applying to the EPO over a long period of time.
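The three stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the cleaning rules, the name-matching algorithm or the scoring scheme, so everything here is an assumption made for illustration. In particular, `difflib.SequenceMatcher` stands in for whatever matching algorithm the paper uses, the record fields (`city`, `applicant`, `ipc`) and the thresholds are hypothetical, and the records are toy data.

```python
import re
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations


def parse_name(raw):
    """Parsing stage (assumed rules): drop accents, punctuation and case."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z ]", " ", text.lower())
    return " ".join(text.split())


def match_names(records, threshold=0.85):
    """Matching stage: pair records whose cleaned names are similar.

    SequenceMatcher is a stand-in similarity measure, and the
    threshold is arbitrary.
    """
    pairs = []
    for a, b in combinations(records, 2):
        ratio = SequenceMatcher(
            None, parse_name(a["name"]), parse_name(b["name"])
        ).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


def filter_pairs(pairs, min_score=1):
    """Filtering stage: keep pairs that share enough extra information.

    One point each for same city, same applicant, and at least one
    shared technology class (hypothetical scoring scheme).
    """
    kept = []
    for a, b in pairs:
        score = 0
        score += a["city"] == b["city"]
        score += a["applicant"] == b["applicant"]
        score += bool(set(a["ipc"]) & set(b["ipc"]))
        if score >= min_score:
            kept.append((a["id"], b["id"]))
    return kept


# Toy records, invented for this example only.
records = [
    {"id": 1, "name": "García, José", "city": "Barcelona",
     "applicant": "Firm A", "ipc": ["H04L"]},
    {"id": 2, "name": "GARCIA JOSE", "city": "Barcelona",
     "applicant": "Firm B", "ipc": ["G06F"]},
    {"id": 3, "name": "Garcia, Jose", "city": "Madrid",
     "applicant": "Firm C", "ipc": ["A61K"]},
    {"id": 4, "name": "Lopez, Ana", "city": "Barcelona",
     "applicant": "Firm A", "ipc": ["H04L"]},
]

same_inventor = filter_pairs(match_names(records))
print(same_inventor)
```

In this toy run, records 1, 2 and 3 all match on name after parsing, but only records 1 and 2 share any additional information (the city), so only that pair survives the filtering stage. Comparing every pair of records is quadratic and is used here only for brevity.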


File URL:
File Function: First version, 2011
Download Restriction: no

File URL:
File Function: Revised version, 2011
Download Restriction: no

Paper provided by Xarxa de Referència en Economia Aplicada (XREAP) in its series Working Papers with number XREAP2011-03.


Length: 46 pages
Date of creation: May 2011
Date of revision: May 2011
Handle: RePEc:xrp:wpaper:xreap2011-03
Contact details of provider: Postal: Espai de Recerca en Economia, Facultat de Ciències Econòmiques i Empresarials, Universitat de Barcelona, c/ Tinent Coronel Valenzuela, 1-11, 08034 Barcelona
Phone: +34+934039653
Web page:


