
“Singling out individual inventors from patent data”


  • Ernest Miguélez

    (Faculty of Economics, University of Barcelona)

  • Ismael Gómez-Miguélez

    (Technical University of Catalonia)


An increasing number of studies in recent years have sought to identify individual inventors from patent data. A variety of heuristics have been proposed for using the names and other information disclosed in patent documents to establish “who is who” in patents. This paper contributes to this literature by describing a methodology for identifying inventors using patent applications filed with the European Patent Office (EPO). As in much of this literature, we follow a three-step procedure: (1) the parsing stage, which reduces noise in the inventor’s name and other fields of the patent; (2) the matching stage, in which name-matching algorithms group similar names; and (3) the filtering stage, in which additional information and various scoring schemes are used to discriminate among these similarly named inventors. The paper presents the results obtained by applying the algorithms to the set of European inventors applying to the EPO over a long period of time.
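
The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the similarity measure (`difflib.SequenceMatcher` from the standard library), the 0.85 threshold, the record fields (`city`, `applicant`, `ipc`), the one-point-per-field score, and the sample records are all assumptions chosen for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations


def parse_name(raw):
    """Parsing stage: strip accents, punctuation, and case differences."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^A-Za-z ]+", " ", text).upper()
    return " ".join(text.split())


def names_match(a, b, threshold=0.85):
    """Matching stage: treat two parsed names as similar.

    The similarity ratio and the threshold are hypothetical choices.
    """
    return SequenceMatcher(None, a, b).ratio() >= threshold


def filter_score(rec_a, rec_b):
    """Filtering stage: one point per shared, non-empty attribute (hypothetical)."""
    score = 0
    for field in ("city", "applicant", "ipc"):
        if rec_a[field] and rec_a[field] == rec_b[field]:
            score += 1
    return score


def same_inventor_pairs(records, min_score=2):
    """Return index pairs whose names match and whose score reaches min_score."""
    parsed = [parse_name(r["name"]) for r in records]
    pairs = []
    for i, j in combinations(range(len(records)), 2):
        if names_match(parsed[i], parsed[j]) and \
                filter_score(records[i], records[j]) >= min_score:
            pairs.append((i, j))
    return pairs


# Invented sample records, for illustration only.
records = [
    {"name": "Müller, Jörg", "city": "Munich", "applicant": "ACME", "ipc": "G06F"},
    {"name": "MULLER JORG", "city": "Munich", "applicant": "ACME", "ipc": "H04L"},
    {"name": "Muller, Jorg", "city": "Hamburg", "applicant": "OTHER", "ipc": "A61K"},
]
print(same_inventor_pairs(records))  # prints [(0, 1)]
```

In this sketch all three records share the same parsed name, so the matching stage alone would merge them; the filtering stage keeps only the pair that also shares a city and an applicant. Raising `min_score` lowers the risk of merging distinct people at the cost of splitting one inventor into several.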

Suggested Citation

  • Ernest Miguélez & Ismael Gómez-Miguélez, 2011. "Singling out individual inventors from patent data," IREA Working Papers 201105, University of Barcelona, Research Institute of Applied Economics, revised May 2011.
  • Handle: RePEc:ira:wpaper:201105


    References listed on IDEAS

    1. Ajay Agrawal & Iain Cockburn & John McHale, 2003. "Gone But Not Forgotten: Labor Flows, Knowledge Spillovers, and Enduring Social Capital," NBER Working Papers 9950, National Bureau of Economic Research, Inc.
    2. Jinyoung Kim & Sangjoon John Lee & Gerald Marschke, 2009. "International Knowledge Flows: Evidence from an Inventor-Firm Matched Data Set," NBER Chapters, in: Science and Engineering Careers in the United States: An Analysis of Markets and Employment, pages 321-348, National Bureau of Economic Research, Inc.
    3. Manuel Trajtenberg & Gil Shiff & Ran Melamed, 2009. "The "Names Game": Harnessing Inventors' Patent Data for Economic Research," Annals of Economics and Statistics, GENES, issue 93-94, pages 67-77.
    4. Zvi Griliches, 1998. "Patent Statistics as Economic Indicators: A Survey," NBER Chapters, in: R&D and Productivity: The Econometric Evidence, pages 287-343, National Bureau of Economic Research, Inc.
    5. Bottazzi, Laura & Peri, Giovanni, 2003. "Innovation and spillovers in regions: Evidence from European patent data," European Economic Review, Elsevier, vol. 47(4), pages 687-710, August.
    6. Nicolas Carayol & Lorenzo Cassi, 2009. "Who's Who in Patents. A Bayesian approach," Cahiers du GREThA 2009-07, Groupe de Recherche en Economie Théorique et Appliquée.
    7. Grid Thoma & Salvatore Torrisi, 2007. "Creating Powerful Indicators for Innovation Studies with Approximate Matching Algorithms. A test based on PATSTAT and Amadeus databases," KITeS Working Papers 211, KITeS, Centre for Knowledge, Internationalization and Technology Studies, Universita' Bocconi, Milano, Italy, revised Dec 2007.
    8. Stéphane Maraut & Hélène Dernis & Colin Webb & Vincenzo Spiezia & Dominique Guellec, 2008. "The OECD REGPAT Database: A Presentation," OECD Science, Technology and Industry Working Papers 2008/2, OECD Publishing.
    9. Raffo, Julio & Lhuillery, Stéphane, 2009. "How to play the "Names Game": Patent retrieval comparing different heuristics," Research Policy, Elsevier, vol. 38(10), pages 1617-1627, December.

    Cited by:

    1. Ventura, Samuel L. & Nugent, Rebecca & Fuchs, Erica R.H., 2015. "Seeing the non-stars: (Some) sources of bias in past disambiguation approaches and a new public tool leveraging labeled records," Research Policy, Elsevier, vol. 44(9), pages 1672-1701.

    More about this item


    Keywords: “Names game”; patent data; unique inventors; name matching algorithms.

    JEL classification:

    • C8 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs
    • J61 - Labor and Demographic Economics - - Mobility, Unemployment, Vacancies, and Immigrant Workers - - - Geographic Labor Mobility; Immigrant Workers
    • O31 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Innovation and Invention: Processes and Incentives
    • O33 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Technological Change: Choices and Consequences; Diffusion Processes
    • R0 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - General
