Printed from https://ideas.repec.org/p/hal/journl/halshs-01074536.html

How to kill inventors: testing the Massacrator© algorithm for inventor disambiguation

Author

Listed:
  • Michele Pezzoni

    (CRIOS - Center for Research on Innovation, Organization and Strategy (University of Bocconi))

  • Francesco Lissoni

    (GREThA - Groupe de Recherche en Economie Théorique et Appliquée - UB - Université de Bordeaux - CNRS - Centre National de la Recherche Scientifique)

  • Gianluca Tarasconi

Abstract

Inventor disambiguation is an increasingly important issue for users of patent data. We propose and test a number of refinements to the Massacrator algorithm, originally proposed by Lissoni et al. (The KEINS database on academic inventors: methodology and contents, 2006) and now applied to APE-INV, a free-access database funded by the European Science Foundation. Following Raffo and Lhuillery (Res Policy 38:1617–1627, 2009), we describe disambiguation as a three-step process: cleaning and parsing, matching, and filtering. By means of sensitivity analysis based on Monte Carlo simulations, we show how various filtering criteria can be manipulated in order to obtain optimal combinations of precision and recall (type I and type II errors). We also show how these different combinations generate different results for applications to studies on inventors' productivity, mobility, and networking, and we discuss data quality problems related to linguistic issues. The filtering criteria based upon information on inventors' addresses are sensitive to data quality, while those based upon information on co-inventorship networks are always effective. Details on data access and data quality improvement via feedback collection are also discussed.
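
The trade-off between precision and recall mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the Massacrator implementation: the record identifiers, the integer scores (read here as "number of filtering criteria a matched pair satisfies"), the threshold rule, and the function name `precision_recall` are all assumptions made for illustration only.

```python
def precision_recall(candidate_pairs, true_pairs, threshold):
    """Precision and recall of the pairs retained by a threshold filter.

    candidate_pairs: dict mapping a pair of record ids to a hypothetical
        score (assumed here: how many filtering criteria the pair meets).
    true_pairs: set of pairs known to refer to the same inventor.
    threshold: minimum score a pair needs to be kept as "same person".
    """
    kept = {pair for pair, score in candidate_pairs.items() if score >= threshold}
    true_positives = len(kept & true_pairs)
    # By convention, an empty set of kept (or true) pairs yields 1.0.
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / len(true_pairs) if true_pairs else 1.0
    return precision, recall


# Toy data (hypothetical): r1 and r2 are one inventor, r4 and r5 another,
# r3 is a homonym of r1/r2 who is actually a different person.
candidate_pairs = {
    ("r1", "r2"): 3,
    ("r1", "r3"): 1,
    ("r2", "r3"): 1,
    ("r4", "r5"): 2,
}
true_pairs = {("r1", "r2"), ("r4", "r5")}

for threshold in (1, 2, 3):
    p, r = precision_recall(candidate_pairs, true_pairs, threshold)
    print(f"threshold={threshold} precision={p:.2f} recall={r:.2f}")
```

In this toy data a loose filter (threshold 1) keeps every true pair but also the false ones, while a strict filter (threshold 3) keeps only correct pairs but misses one inventor. A sensitivity analysis of the kind the abstract describes would explore many such filter settings rather than three.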

Suggested Citation

  • Michele Pezzoni & Francesco Lissoni & Gianluca Tarasconi, 2014. "How to kill inventors: testing the Massacrator© algorithm for inventor disambiguation," Post-Print halshs-01074536, HAL.
  • Handle: RePEc:hal:journl:halshs-01074536
    DOI: 10.1007/s11192-014-1375-7

    References listed on IDEAS

    1. Francesco Lissoni & Patrick Llerena & Maureen McKelvey & Bulat Sanditov, 2008. "Academic patenting in Europe: new evidence from the KEINS database," Research Evaluation, Oxford University Press, vol. 17(2), pages 87-102, June.
    2. Manuel Trajtenberg & Gil Shiff & Ran Melamed, 2009. "The "Names Game": Harnessing Inventors' Patent Data for Economic Research," Annals of Economics and Statistics, GENES, issue 93-94, pages 67-77.
    3. repec:fth:harver:1473 is not listed on IDEAS
    4. Gerald Marschke, 2006. "The influence of university research on industrial innovation," Proceedings, Federal Reserve Bank of Cleveland.
    5. Raffo, Julio & Lhuillery, Stéphane, 2009. "How to play the "Names Game": Patent retrieval comparing different heuristics," Research Policy, Elsevier, vol. 38(10), pages 1617-1627, December.
    6. Grid Thoma & Salvatore Torrisi & Alfonso Gambardella & Dominique Guellec & Bronwyn H. Hall & Dietmar Harhoff, 2010. "Harmonizing and Combining Large Datasets - An Application to Firm-Level Patent and Accounting Data," NBER Working Papers 15851, National Bureau of Economic Research, Inc.
    7. Stefano Breschi & Francesco Lissoni & Gianluca Tarasconi, 2014. "Inventor Data for Research on Migration and Innovation: A Survey and a Pilot," WIPO Economic Research Working Papers 17, World Intellectual Property Organization - Economics and Statistics Division.
    8. Li Tang & John P. Walsh, 2010. "Bibliometric fingerprints: name disambiguation based on approximate structure equivalence of cognitive maps," Scientometrics, Springer;Akadémiai Kiadó, vol. 84(3), pages 763-784, September.
    9. Nicolas CARAYOL & Lorenzo CASSI, 2009. "Who's Who in Patents. A Bayesian approach," Cahiers du GREThA (2007-2019) 2009-07, Groupe de Recherche en Economie Théorique et Appliquée (GREThA).
    10. Alexandre BERTHE & Sylvie FERRARI, 2012. "Ecological inequalities: how to link unequal access to the environment with theories of justice?," Cahiers du GREThA (2007-2019) 2012-17, Groupe de Recherche en Economie Théorique et Appliquée (GREThA).
    11. Pierre Azoulay & Waverly Ding & Toby Stuart, 2009. "The Impact Of Academic Patenting On The Rate, Quality And Direction Of (Public) Research Output," Journal of Industrial Economics, Wiley Blackwell, vol. 57(4), pages 637-676, December.
    12. Grid Thoma & Salvatore Torrisi, 2007. "Creating Powerful Indicators for Innovation Studies with Approximate Matching Algorithms. A test based on PATSTAT and Amadeus databases," KITeS Working Papers 211, KITeS, Centre for Knowledge, Internationalization and Technology Studies, Universita' Bocconi, Milano, Italy, revised Dec 2007.
    13. Stefano Breschi & Francesco Lissoni & Fabio Montobbio, 2006. "University patenting and scientific productivity. A quantitative study of Italian academic inventors," KITeS Working Papers 189, KITeS, Centre for Knowledge, Internationalization and Technology Studies, Universita' Bocconi, Milano, Italy, revised Nov 2006.
    14. Francesco Lissoni & Patrick Llerena & Bulat Sanditov, 2011. "Small Worlds in Networks of Inventors and the Role of Science: An Analysis of France," Working Papers of BETA 2011-18, Bureau d'Economie Théorique et Appliquée, UDS, Strasbourg.
    15. Ernest Miguelez & Carsten Fink, 2013. "Measuring the International Mobility of Inventors: A New Database," WIPO Economic Research Working Papers 08, World Intellectual Property Organization - Economics and Statistics Division, revised May 2013.
    16. Zvi Griliches, 1998. "Patent Statistics as Economic Indicators: A Survey," NBER Chapters, in: R&D and Productivity: The Econometric Evidence, pages 287-343, National Bureau of Economic Research, Inc.
    17. Li, Guan-Cheng & Lai, Ronald & D’Amour, Alexander & Doolin, David M. & Sun, Ye & Torvik, Vetle I. & Yu, Amy Z. & Fleming, Lee, 2014. "Disambiguation and co-authorship networks of the U.S. patent inventor database (1975–2010)," Research Policy, Elsevier, vol. 43(6), pages 941-955.
    18. Matt Marx & Deborah Strumsky & Lee Fleming, 2009. "Mobility, Skills, and the Michigan Non-Compete Experiment," Management Science, INFORMS, vol. 55(6), pages 875-889, June.
    19. repec:wip:wpaper:8 is not listed on IDEAS
    20. Nagaoka, Sadao & Motohashi, Kazuyuki & Goto, Akira, 2010. "Patent Statistics as an Innovation Indicator," Handbook of the Economics of Innovation, in: Bronwyn H. Hall & Nathan Rosenberg (ed.), Handbook of the Economics of Innovation, edition 1, volume 2, chapter 0, pages 1083-1127, Elsevier.
    21. Francesco Lissoni & Bulat Sanditov & Gianluca Tarasconi, 2006. "The Keins Database on Academic Inventors: Methodology and Contents," KITeS Working Papers 181, KITeS, Centre for Knowledge, Internationalization and Technology Studies, Universita' Bocconi, Milano, Italy, revised Sep 2006.
    22. Francesco Lissoni & Michele Pezzoni & Bianca Potì & Sandra Romagnosi, 2013. "University Autonomy, the Professor Privilege and Academic Patenting: Italy, 1996-2007," Industry and Innovation, Taylor & Francis Journals, vol. 20(5), pages 399-421, July.
    23. Lee Fleming & Charles King & Adam I. Juda, 2007. "Small Worlds and Regional Innovation," Organization Science, INFORMS, vol. 18(6), pages 938-954, December.
    24. Stefano Breschi & Francesco Lissoni, 2009. "Mobility of skilled workers and co-invention networks: an anatomy of localized knowledge flows," Journal of Economic Geography, Oxford University Press, vol. 9(4), pages 439-468, July.
    25. Vetle I. Torvik & Marc Weeber & Don R. Swanson & Neil R. Smalheiser, 2005. "A probabilistic similarity metric for Medline records: A model for author name disambiguation," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 56(2), pages 140-158, January.
    26. Balconi, Margherita & Breschi, Stefano & Lissoni, Francesco, 2004. "Networks of inventors and the role of academia: an exploration of Italian patent data," Research Policy, Elsevier, vol. 33(1), pages 127-145, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Massimiliano Ferrara & Roberto Mavilia & Bruno Antonio Pansera, 2017. "Extracting knowledge patterns with a social network analysis approach: an alternative methodology for assessing the impact of power inventors," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(3), pages 1593-1625, December.
    2. Ventura, Samuel L. & Nugent, Rebecca & Fuchs, Erica R.H., 2015. "Seeing the non-stars: (Some) sources of bias in past disambiguation approaches and a new public tool leveraging labeled records," Research Policy, Elsevier, vol. 44(9), pages 1672-1701.
    3. Li, Guan-Cheng & Lai, Ronald & D’Amour, Alexander & Doolin, David M. & Sun, Ye & Torvik, Vetle I. & Yu, Amy Z. & Fleming, Lee, 2014. "Disambiguation and co-authorship networks of the U.S. patent inventor database (1975–2010)," Research Policy, Elsevier, vol. 43(6), pages 941-955.
    4. Deyun Yin & Kazuyuki Motohashi & Jianwei Dang, 2020. "Large-scale name disambiguation of Chinese patent inventors (1985–2016)," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(2), pages 765-790, February.
    5. Francesco Lissoni & Michele Pezzoni & Bianca Potì & Sandra Romagnosi, 2012. "University autonomy, IP legislation and academic patenting: Italy, 1996-2007," Post-Print hal-00779750, HAL.
    6. Miguelez, Ernest, 2019. "Collaborative patents and the mobility of knowledge workers," Technovation, Elsevier, vol. 86, pages 62-74.
    7. Ernest Miguélez & Rosina Moreno & Jordi Suriñach, 2010. "Inventors on the move: Tracing inventors' mobility and its spatial distribution," Papers in Regional Science, Wiley Blackwell, vol. 89(2), pages 251-274, June.
    8. Ernest Miguélez & Ismael Gómez-Miguélez, 2011. "Singling out individual inventors from patent data," IREA Working Papers 201105, University of Barcelona, Research Institute of Applied Economics, revised May 2011.
    9. Malwina Mejer, 2011. "Entrepreneurial Scientists and their Publication Performance. An Insight from Belgium," Working Papers ECARES ECARES 2011-017, ULB -- Universite Libre de Bruxelles.
    10. Carayol, Nicolas & Bergé, Laurent & Cassi, Lorenzo & Roux, Pascale, 2019. "Unintended triadic closure in social networks: The strategic formation of research collaborations between French inventors," Journal of Economic Behavior & Organization, Elsevier, vol. 163(C), pages 218-238.
    11. Zi‐Lin He & Tony W. Tong & Yuchen Zhang & Wenlong He, 2018. "Constructing a Chinese Patent Database of listed firms in China: Descriptions, lessons, and insights," Journal of Economics & Management Strategy, Wiley Blackwell, vol. 27(3), pages 579-606, September.
    12. Cristelli, Gabriele & Lissoni, Francesco, 2020. "Free movement of inventors: open-border policy and innovation in Switzerland," MPRA Paper 107433, University Library of Munich, Germany.
    13. Stefano Breschi & Francesco Lissoni & Ernest Miguelez, 2017. "Foreign-origin inventors in the USA: testing for diaspora and brain gain effects," Journal of Economic Geography, Oxford University Press, vol. 17(5), pages 1009-1038.
    14. Malwina Mejer, 2012. "Academic Patenting in Belgium:Methodology and Evidence," Working Papers TIMES² 2013-003, ULB -- Universite Libre de Bruxelles.
    15. Charlotta Dahlborg & Danielle Lewensohn & Rickard Danell & Carl Johan Sundberg, 2017. "To invent and let others innovate: a framework of academic patent transfer modes," The Journal of Technology Transfer, Springer, vol. 42(3), pages 538-563, June.
    16. Pellegrino, Gabriele & Penner, Orion & Piguet, Etienne & de Rassenfosse, Gaétan, 2023. "Productivity gains from migration: Evidence from inventors," Research Policy, Elsevier, vol. 52(1).
    17. Lissoni, Francesco, 2010. "Academic inventors as brokers," Research Policy, Elsevier, vol. 39(7), pages 843-857, September.
    18. Diego Useche & Ernest Miguelez & Francesco Lissoni, 2020. "Highly skilled and well connected: Migrant inventors in cross-border M&As," Journal of International Business Studies, Palgrave Macmillan;Academy of International Business, vol. 51(5), pages 737-763, July.
    19. Stefano Breschi & Francesco Lissoni & Gianluca Tarasconi, 2014. "Inventor Data for Research on Migration and Innovation: A Survey and a Pilot," WIPO Economic Research Working Papers 17, World Intellectual Property Organization - Economics and Statistics Division.
    20. Francesco Lissoni, 2013. "Intellectual property and university–industry technology transfer," Chapters, in: Faïz Gallouj & Luis Rubalcaba & Paul Windrum (ed.), Public–Private Innovation Networks in Services, chapter 7, pages 164-194, Edward Elgar Publishing.

    More about this item

    Keywords

    Patent data; Inventors; Name disambiguation

    JEL classification:

    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • O34 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Intellectual Property and Intellectual Capital
