Printed from https://ideas.repec.org/p/grt/wpegrt/2012-29.html

How To Kill Inventors: Testing The Massacrator© Algorithm For Inventor Disambiguation

Author

Listed:
  • Michele PEZZONI (University of Milano-Bicocca - KiTES-Università Bocconi - Observatoire des Sciences et des Techniques)
  • Francesco LISSONI (GREThA, CNRS, UMR 5113 - KiTES)
  • Gianluca TARASCONI (KiTES, Università Bocconi)

Abstract

Inventor disambiguation is an increasingly important issue for users of patent data. We propose and test a number of refinements to the Massacrator© algorithm, originally proposed by Lissoni et al. (2006) and now applied to APE-INV, a free-access database funded by the European Science Foundation. Following Raffo and Lhuillery (2009), we describe disambiguation as a three-step process: cleaning and parsing, matching, and filtering. By means of sensitivity analysis based on Monte Carlo simulations, we show how various filtering criteria can be manipulated in order to obtain optimal combinations of precision and recall (type I and type II errors). We also show how these different combinations generate different results for applications to studies on inventors' productivity, mobility, and networking. The filtering criteria based upon information on inventors' addresses are sensitive to data quality, while those based upon information on co-inventorship networks are always effective. Details on data access and data quality improvement via feedback collection are also discussed.
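
The three-step process named in the abstract, and the precision/recall trade-off that filtering controls, can be illustrated with a minimal sketch. This is not the Massacrator© algorithm: the toy records, the benchmark pairs, the two similarity criteria (same city, shared co-inventor), and all function names are hypothetical choices made only for illustration.

```python
from itertools import combinations

# Hypothetical toy records: (record_id, raw inventor name, city, co-inventor names).
RECORDS = [
    (1, "ROSSI, Mario", "Milano", {"BIANCHI LUCA"}),
    (2, "Rossi Mario ", "Milano", {"VERDI ANNA"}),
    (3, "Rossi, Mario", "Torino", {"BIANCHI LUCA"}),
    (4, "Rossi, Mario", "Napoli", {"NERI PAOLO"}),
]
# Hypothetical hand-checked benchmark: records 1, 2, 3 are one person, 4 is another.
TRUE_PAIRS = {(1, 2), (1, 3), (2, 3)}


def clean(name):
    """Step 1 (cleaning and parsing): uppercase, drop commas, sort name tokens."""
    tokens = name.upper().replace(",", " ").split()
    return " ".join(sorted(tokens))


def match(records):
    """Step 2 (matching): pair up records whose cleaned names are identical."""
    return [(a, b) for a, b in combinations(records, 2) if clean(a[1]) == clean(b[1])]


def score(a, b):
    """Count how many illustrative similarity criteria a candidate pair meets."""
    same_city = a[2] == b[2]
    shared_coinventor = bool(a[3] & b[3])
    return int(same_city) + int(shared_coinventor)


def filter_pairs(pairs, threshold):
    """Step 3 (filtering): keep candidate pairs whose score reaches the threshold."""
    return {(a[0], b[0]) for a, b in pairs if score(a, b) >= threshold}


def precision_recall(predicted, truth):
    """Precision = true positives / predicted pairs; recall = true positives / true pairs."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 1.0  # convention for an empty prediction
    recall = tp / len(truth) if truth else 1.0
    return precision, recall


if __name__ == "__main__":
    for threshold in (0, 1, 2):
        kept = filter_pairs(match(RECORDS), threshold)
        p, r = precision_recall(kept, TRUE_PAIRS)
        print(f"threshold={threshold} precision={p:.2f} recall={r:.2f}")
```

In this toy setting, raising the threshold removes false matches at the cost of missing true ones, which is the kind of trade-off the sensitivity analysis in the paper is said to explore over many filtering criteria.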

Suggested Citation

  • Michele PEZZONI (University of Milano-Bicocca - KiTES-Università Bocconi - Observatoire des Sciences et des Techniques) & Francesco LISSONI (GREThA, CNRS, UMR 5113 - KiTES) & Gianluca TARASCONI (KiTES, Università Bocconi), 2012. "How To Kill Inventors: Testing The Massacrator© Algorithm For Inventor Disambiguation," Cahiers du GREThA 2012-29, Groupe de Recherche en Economie Théorique et Appliquée.
  • Handle: RePEc:grt:wpegrt:2012-29

    Download full text from publisher

    File URL: http://cahiersdugretha.u-bordeaux4.fr/2012/2012-29.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Nagaoka, Sadao & Motohashi, Kazuyuki & Goto, Akira, 2010. "Patent Statistics as an Innovation Indicator," Handbook of the Economics of Innovation, Elsevier.
    2. Lorenzo Cassi & Nicolas Carayol, 2009. "Who's Who in Patents. A Bayesian approach," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-00631750, HAL.
    3. Zvi Griliches, 1998. "Patent Statistics as Economic Indicators: A Survey," NBER Chapters, in: R&D and Productivity: The Econometric Evidence, pages 287-343, National Bureau of Economic Research, Inc.
    4. Francesco Lissoni & Patrick Llerena & Maureen McKelvey & Bulat Sanditov, 2008. "Academic patenting in Europe: new evidence from the KEINS database," Research Evaluation, Oxford University Press, vol. 17(2), pages 87-102, June.
    5. Francesco Lissoni & Bulat Sanditov & Gianluca Tarasconi, 2006. "The Keins Database on Academic Inventors: Methodology and Contents," KITeS Working Papers 181, KITeS, Centre for Knowledge, Internationalization and Technology Studies, Universita' Bocconi, Milano, Italy, revised Sep 2006.
    6. repec:fth:harver:1473 is not listed on IDEAS
    7. Gerald Marschke, 2006. "The influence of university research on industrial innovation," Proceedings, Federal Reserve Bank of Cleveland.
    8. Pierre Azoulay & Waverly Ding & Toby Stuart, 2009. "The Impact of Academic Patenting on the Rate, Quality and Direction of (Public) Research Output," Journal of Industrial Economics, Wiley Blackwell, vol. 57(4), pages 637-676, December.
    9. Grid Thoma & Salvatore Torrisi, 2007. "Creating Powerful Indicators for Innovation Studies with Approximate Matching Algorithms. A test based on PATSTAT and Amadeus databases," KITeS Working Papers 211, KITeS, Centre for Knowledge, Internationalization and Technology Studies, Universita' Bocconi, Milano, Italy, revised Dec 2007.
    10. repec:cmi:wpaper:cemi-workingpaper-2009-006 is not listed on IDEAS
    11. Lorenzo Cassi & Nicolas Carayol, 2009. "Who's Who in Patents. A Bayesian approach," Working Papers hal-00631750, HAL.
    12. Francesco Lissoni & Michele Pezzoni & Bianca Potì & Sandra Romagnosi, 2013. "University Autonomy, the Professor Privilege and Academic Patenting: Italy, 1996-2007," Industry and Innovation, Taylor & Francis Journals, vol. 20(5), pages 399-421, July.
    13. Lee Fleming & Charles King & Adam I. Juda, 2007. "Small Worlds and Regional Innovation," Organization Science, INFORMS, vol. 18(6), pages 938-954, December.
    14. Stefano Breschi & Francesco Lissoni, 2009. "Mobility of skilled workers and co-invention networks: an anatomy of localized knowledge flows," Journal of Economic Geography, Oxford University Press, vol. 9(4), pages 439-468, July.
    15. Grid Thoma & Salvatore Torrisi & Alfonso Gambardella & Dominique Guellec & Bronwyn H. Hall & Dietmar Harhoff, 2010. "Harmonizing and Combining Large Datasets - An Application to Firm-Level Patent and Accounting Data," NBER Working Papers 15851, National Bureau of Economic Research, Inc.
    16. Li, Guan-Cheng & Lai, Ronald & D’Amour, Alexander & Doolin, David M. & Sun, Ye & Torvik, Vetle I. & Yu, Amy Z. & Fleming, Lee, 2014. "Disambiguation and co-authorship networks of the U.S. patent inventor database (1975–2010)," Research Policy, Elsevier, vol. 43(6), pages 941-955.
    17. Matt Marx & Deborah Strumsky & Lee Fleming, 2009. "Mobility, Skills, and the Michigan Non-Compete Experiment," Management Science, INFORMS, vol. 55(6), pages 875-889, June.
    18. Stefano Breschi & Francesco Lissoni & Fabio Montobbio, 2006. "University patenting and scientific productivity. A quantitative study of Italian academic inventors," KITeS Working Papers 189, KITeS, Centre for Knowledge, Internationalization and Technology Studies, Universita' Bocconi, Milano, Italy, revised Nov 2006.
    19. Raffo, Julio & Lhuillery, Stéphane, 2009. "How to play the "Names Game": Patent retrieval comparing different heuristics," Research Policy, Elsevier, vol. 38(10), pages 1617-1627, December.
    20. Balconi, Margherita & Breschi, Stefano & Lissoni, Francesco, 2004. "Networks of inventors and the role of academia: an exploration of Italian patent data," Research Policy, Elsevier, vol. 33(1), pages 127-145, January.

    Citations

    Cited by:

    1. Nicolas CARAYOL & Lorenzo CASSI & Pascale ROUX, 2014. "Unintended triadic closure in social networks: The strategic formation of research collaborations between French inventors," Cahiers du GREThA 2014-13, Groupe de Recherche en Economie Théorique et Appliquée.
    2. Stefano Breschi & Francesco Lissoni & Gianluca Tarasconi, 2014. "Inventor Data for Research on Migration and Innovation: A Survey and a Pilot," WIPO Economic Research Working Papers 17, World Intellectual Property Organization - Economics and Statistics Division.
    3. Laurent Bergé & Nicolas Carayol & Pascale Roux, 2017. "How do inventor networks affect urban invention?," CREA Discussion Paper Series 17-03, Center for Research in Economic Analysis, University of Luxembourg.
    4. Clément Gorin, 2017. "Accessibility, absorptive capacity and innovation in European urban areas," Working Papers halshs-01584111, HAL.
    5. Akcigit, Ufuk & Caicedo Soler, Santiago & Miguelez, Ernest & Stantcheva, Stefanie & Sterzi, Valerio, 2018. "Dancing with the Stars: Innovation through Interactions," CEPR Discussion Papers 12819, C.E.P.R. Discussion Papers.
    6. Orsatti, Gianluca & Pezzoni, Michele & Quatraro, Francesco, 2017. "Where Do Green Technologies Come From? Inventor Teams’ Recombinant Capabilities and the Creation of New Knowledge," Department of Economics and Statistics Cognetti de Martiis. Working Papers 201711, University of Turin.
    7. Francesco Lissoni & Michele Pezzoni & Bianca Potì & Sandra Romagnosi, 2013. "University Autonomy, the Professor Privilege and Academic Patenting: Italy, 1996-2007," Industry and Innovation, Taylor & Francis Journals, vol. 20(5), pages 399-421, July.
    8. Tarasconi, Gianluca & Kang, Byeongwoo, 2015. "PATSTAT revisited," IDE Discussion Papers 527, Institute of Developing Economies, Japan External Trade Organization(JETRO).
    9. repec:spr:scient:v:101:y:2014:i:1:d:10.1007_s11192-014-1409-1 is not listed on IDEAS
    10. Dieter F. Kogler & Jürgen Essletzbichler & David L. Rigby, 2017. "The evolution of specialization in the EU15 knowledge space," Journal of Economic Geography, Oxford University Press, vol. 17(2), pages 345-373.
    11. Clément Gorin, 2017. "Accessibility, absorptive capacity and innovation in European urban areas," Working Papers 1722, Groupe d'Analyse et de Théorie Economique Lyon St-Étienne (GATE Lyon St-Étienne), Université de Lyon.
    12. Stéphane Maraut & Catalina Martínez, 2014. "Identifying author–inventors from Spain: methods and a first insight into results," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 445-476, October.
    13. repec:oup:jecgeo:v:17:y:2017:i:5:p:1009-1038. is not listed on IDEAS
    14. Li, Guan-Cheng & Lai, Ronald & D’Amour, Alexander & Doolin, David M. & Sun, Ye & Torvik, Vetle I. & Yu, Amy Z. & Fleming, Lee, 2014. "Disambiguation and co-authorship networks of the U.S. patent inventor database (1975–2010)," Research Policy, Elsevier, vol. 43(6), pages 941-955.
    15. repec:spr:scient:v:113:y:2017:i:3:d:10.1007_s11192-017-2536-2 is not listed on IDEAS

    More about this item

    Keywords

    patent data; inventors; name disambiguation

    JEL classification:

    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • O34 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Intellectual Property and Intellectual Capital
