How To Kill Inventors: Testing The Massacrator© Algorithm For Inventor Disambiguation
Inventor disambiguation is an increasingly important issue for users of patent data. We propose and test a number of refinements to the Massacrator© algorithm, originally proposed by Lissoni et al. (2006) and now applied to APE-INV, a free-access database funded by the European Science Foundation. Following Raffo and Lhuillery (2009), we describe disambiguation as a three-step process: cleaning and parsing, matching, and filtering. By means of sensitivity analysis based on Monte Carlo simulations, we show how various filtering criteria can be manipulated in order to obtain optimal combinations of precision and recall (type I and type II errors). We also show how these different combinations generate different results for applications to studies on inventors' productivity, mobility, and networking. The filtering criteria based upon information on inventors' addresses are sensitive to data quality, while those based upon information on co-inventorship networks are always effective. Details on data access and data quality improvement via feedback collection are also discussed.
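To make the three-step description concrete, the following is a minimal Python sketch of a cleaning, matching, and filtering pipeline, with precision and recall computed against a hand-labelled truth set. It is not the Massacrator© implementation: the toy records, the function names (`clean`, `match`, `filter_pairs`, `precision_recall`), the two filtering criteria (same city, shared co-inventor), and the integer threshold are all assumptions chosen for illustration, and a plain threshold sweep stands in for the Monte Carlo sensitivity analysis mentioned in the abstract.

```python
from itertools import combinations
import re
import unicodedata

# Hypothetical toy records: (record_id, raw name, city, co-inventor names).
RECORDS = [
    (1, "Rossi, Mario", "Milano", {"BIANCHI L"}),
    (2, "ROSSI Mario", "Milano", {"VERDI G"}),
    (3, "Mario Rossi", "Torino", {"BIANCHI L"}),
    (4, "Rossi, Maria", "Roma", set()),
]

# Hypothetical ground truth: only records 1 and 3 are the same person.
TRUTH = {(1, 3)}


def clean(name):
    """Step 1 (cleaning and parsing): drop accents and punctuation,
    uppercase, and sort the name tokens so word order does not matter."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    tokens = re.findall(r"[A-Za-z]+", ascii_name.upper())
    return " ".join(sorted(tokens))


def match(records):
    """Step 2 (matching): pair records whose cleaned names are identical."""
    return [(a, b) for a, b in combinations(records, 2) if clean(a[1]) == clean(b[1])]


def filter_pairs(pairs, threshold):
    """Step 3 (filtering): keep a matched pair only if its score reaches
    the threshold. Score = 1 for same city + 1 for a shared co-inventor."""
    kept = set()
    for a, b in pairs:
        score = int(a[2] == b[2]) + int(bool(a[3] & b[3]))
        if score >= threshold:
            kept.add((a[0], b[0]))
    return kept


def precision_recall(predicted, truth):
    """Precision and recall of predicted same-person pairs.
    By convention, an empty prediction set has precision 1.0."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(truth) if truth else 1.0
    return precision, recall


if __name__ == "__main__":
    # Sweep the filtering threshold to see the precision/recall trade-off.
    for threshold in (0, 1, 2):
        predicted = filter_pairs(match(RECORDS), threshold)
        print(threshold, sorted(predicted), precision_recall(predicted, TRUTH))
```

On this toy data, raising the threshold from 0 to 2 moves precision from 1/3 to 1.0 while recall falls from 1.0 to 0.0, which is the kind of trade-off the filtering criteria are tuned to balance.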
Date of creation: 2012
Contact details of provider: Postal: Avenue Léon Duguit, 33608 Pessac Cedex
Web page: http://gretha.u-bordeaux4.fr/
References:
- Lorenzo Cassi & Nicolas Carayol, 2009. "Who's Who in Patents. A Bayesian approach," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers).
- Nicolas Carayol & Lorenzo Cassi, 2009. "Who's Who in Patents. A Bayesian approach," Cahiers du GREThA 2009-07, Groupe de Recherche en Economie Théorique et Appliquée.
- Gerald Marschke, 2006. "The influence of university research on industrial innovation," Proceedings, Federal Reserve Bank of Cleveland.
- Jinyoung Kim & Sangjoon John Lee & Gerald Marschke, 2005. "The Influence of University Research on Industrial Innovation," NBER Working Papers 11447, National Bureau of Economic Research, Inc.
- Jinyoung Kim & Sangjoon John Lee & Gerald Marschke, 2010. "The Influence of University Research on Industrial Innovation," Discussion Paper Series 1006, Institute of Economic Research, Korea University.
- Grid Thoma & Salvatore Torrisi, 2007. "Creating Powerful Indicators for Innovation Studies with Approximate Matching Algorithms. A test based on PATSTAT and Amadeus databases," KITeS Working Papers 211, KITeS, Centre for Knowledge, Internationalization and Technology Studies, Universita' Bocconi, Milano, Italy, revised Dec 2007.
- Stefano Breschi & Francesco Lissoni, 2009. "Mobility of skilled workers and co-invention networks: an anatomy of localized knowledge flows," Journal of Economic Geography, Oxford University Press, vol. 9(4), pages 439-468, July.
- Grid Thoma & Salvatore Torrisi & Alfonso Gambardella & Dominique Guellec & Bronwyn H. Hall & Dietmar Harhoff, 2010. "Harmonizing and Combining Large Datasets - An Application to Firm-Level Patent and Accounting Data," NBER Working Papers 15851, National Bureau of Economic Research, Inc.
- Matt Marx & Deborah Strumsky & Lee Fleming, 2009. "Mobility, Skills, and the Michigan Non-Compete Experiment," Management Science, INFORMS, vol. 55(6), pages 875-889, June.
- Stefano Breschi & Francesco Lissoni & Fabio Montobbio, 2006. "University patenting and scientific productivity. A quantitative study of Italian academic inventors," KITeS Working Papers 189, KITeS, Centre for Knowledge, Internationalization and Technology Studies, Universita' Bocconi, Milano, Italy, revised Nov 2006.
- Julio Raffo & Stéphane Lhuillery, 2009. "How to play the 'Names Game': Patent retrieval comparing different heuristics," Research Policy, Elsevier, vol. 38(10), pages 1617-1627, December.