IDEAS home Printed from https://ideas.repec.org/p/hal/journl/halshs-01074536.html

How to kill inventors: testing the Massacrator© algorithm for inventor disambiguation

Author

Listed:
  • Michele Pezzoni

    (CRIOS - Center for Research on Innovation, Organization and Strategy (University of Bocconi))

  • Francesco Lissoni

    (GREThA - Groupe de Recherche en Economie Théorique et Appliquée - UB - Université de Bordeaux - CNRS - Centre National de la Recherche Scientifique)

  • Gianluca Tarasconi

Abstract

Inventor disambiguation is an increasingly important issue for users of patent data. We propose and test a number of refinements to the original Massacrator algorithm, originally proposed by Lissoni et al. (The keins database on academic inventors: methodology and contents, 2006) and now applied to APE-INV, a free access database funded by the European Science Foundation. Following Raffo and Lhuillery (Res Policy 38:1617–1627, 2009) we describe disambiguation as a three step process: cleaning&parsing, matching, and filtering. By means of sensitivity analysis, based on MonteCarlo simulations, we show how various filtering criteria can be manipulated in order to obtain optimal combinations of precision and recall (type I and type II errors). We also show how these different combinations generate different results for applications to studies on inventors' productivity, mobility, and networking; and discuss quality issues related to linguistic issues. The filtering criteria based upon information on inventors' addresses are sensitive to data quality, while those based upon information on co-inventorship networks are always effective. Details on data access and data quality improvement via feedback collection are also discussed.

Suggested Citation

  • Michele Pezzoni & Francesco Lissoni & Gianluca Tarasconi, 2014. "How to kill inventors: testing the Massacrator© algorithm for inventor disambiguation," Post-Print halshs-01074536, HAL.
  • Handle: RePEc:hal:journl:halshs-01074536
    DOI: 10.1007/s11192-014-1375-7
    as

    Download full text from publisher

    To our knowledge, this item is not available for download. To find whether it is available, there are three options:
    1. Check below whether another version of this item is available online.
    2. Check on the provider's web page whether it is in fact available.
    3. Perform a
    for a similarly titled item that would be available.

    Other versions of this item:

    More about this item

    Keywords

    ;

    JEL classification:

    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • O34 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Intellectual Property and Intellectual Capital

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:hal:journl:halshs-01074536. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: CCSD (email available below). General contact details of provider: https://hal.archives-ouvertes.fr/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.