
RLPC: Record Linkage Pre-Cleaning – Technical Documentation of Routines

Author

Listed:
  • Ehrenfeld, Wilfried

Abstract

The primary objective of record linkage is to merge different data sets on the basis of a unique identifier. The cases at hand are mostly company data sets from databases of company characteristics (e.g. BvD Amadeus/Dafne), patent data sets (e.g. Patstat or DPMA) and funding data sets (e.g. the BMBF funding catalog). These data sets are to be merged on the basis of company names. Because the notation of company names varies across databases, for example in how the corporate structure is written, the names must be harmonized and standardized. The routines described here implement the record linkage pre-cleaning (RLPC). They create record-linkage-compatible names (RLName) from given actor names (Name). This includes converting special characters to ASCII characters, identifying corporate structures, and isolating and separating bracketed expressions. The result is an expression that can be compared with other names. Following this pre-cleaning, record linkage systems can be used to merge several data sets that have been pretreated in the same way.
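
The three pre-cleaning steps named in the abstract (ASCII conversion, identification of corporate structures, separation of bracketed expressions) can be illustrated with a minimal Python sketch. It is not the RLPC implementation described in the report. The function names, the transliteration table, the short list of legal forms and the reading of "corporate structure" as a trailing legal-form suffix such as GmbH or AG are all assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical transliteration table for German characters (assumption:
# the report's actual character mapping is not reproduced here).
GERMAN_MAP = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss",
})

# Hypothetical, deliberately short list of legal forms. Longer forms come
# first so that "gmbh & co kg" is matched before "kg".
LEGAL_FORMS = ["gmbh & co kg", "gmbh", "ag", "kg", "ug", "ltd", "inc"]


def to_ascii(name):
    """Transliterate German characters, then strip remaining diacritics."""
    name = name.translate(GERMAN_MAP)
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def split_brackets(name):
    """Return the name without (...) parts, plus the list of bracket contents."""
    brackets = [b.strip() for b in re.findall(r"\(([^()]*)\)", name)]
    rest = re.sub(r"\([^()]*\)", " ", name)
    return rest, brackets


def rl_name(name):
    """Build a comparison-ready name from a raw actor name (sketch only)."""
    text, brackets = split_brackets(to_ascii(name))
    text = text.lower()
    text = re.sub(r"[^a-z0-9& ]", " ", text)   # punctuation becomes a space
    text = re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace
    legal_form = None
    for form in LEGAL_FORMS:
        if text == form or text.endswith(" " + form):
            legal_form = form
            text = text[: len(text) - len(form)].strip()
            break
    return {"rl_name": text, "legal_form": legal_form, "brackets": brackets}


if __name__ == "__main__":
    print(rl_name("Müller & Söhne GmbH (Halle)"))
    print(rl_name("Mueller & Soehne GmbH"))
```

Both example names reduce to the same `rl_name` value, so a downstream record linkage system could compare them directly. The legal form and the bracketed expression are returned separately rather than discarded, so they remain available as additional matching criteria.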

Suggested Citation

  • Ehrenfeld, Wilfried, 2015. "RLPC: Record Linkage Pre-Cleaning – Technical Documentation of Routines," IWH Technical Reports 02/2015e, Halle Institute for Economic Research (IWH).
  • Handle: RePEc:zbw:iwhtrp:022015e

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/144719/1/ITR_2015-02e.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Ehrenfeld, Wilfried, 2015. "RegDemo: Preparation and Merger of Actor Data – Technical Documentation of Routines and Datasets," IWH Technical Reports 01/2015e, Halle Institute for Economic Research (IWH).
    2. Ehrenfeld, Wilfried, 2015. "Research Explorer – Technical Documentation of Routines," IWH Technical Reports 03/2015e, Halle Institute for Economic Research (IWH).

    Citations



    Cited by:

    1. Fritsch, Michael & Titze, Mirko & Piontek, Matthias, 2020. "Identifying cooperation for innovation―a comparison of data sources," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 27(6), pages 630-659.
    2. Ehrenfeld, Wilfried, 2015. "RegDemo: Preparation and Merger of Actor Data – Technical Documentation of Routines and Datasets," IWH Technical Reports 01/2015e, Halle Institute for Economic Research (IWH).
    3. Michael Fritsch & Mirko Titze & Matthias Piontek, 2018. "Knowledge Interactions in Regional Innovation Networks: Comparing Data Sources," Jena Economics Research Papers 2018-003, Friedrich-Schiller-University Jena.
    4. Ehrenfeld, Wilfried, 2015. "RegDemo: Aufbereitung und Zusammenführung der Akteursdaten – Technische Dokumentation der Routinen und Datensätze," IWH Technical Reports 01/2015, Halle Institute for Economic Research (IWH).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Michael Fritsch & Mirko Titze & Matthias Piontek, 2020. "Identifying cooperation for innovation―a comparison of data sources," Industry and Innovation, Taylor & Francis Journals, vol. 27(6), pages 630-659, June.
    2. Ehrenfeld, Wilfried, 2015. "RegDemo: Preparation and Merger of Actor Data – Technical Documentation of Routines and Datasets," IWH Technical Reports 01/2015e, Halle Institute for Economic Research (IWH).
    3. Michael Fritsch & Mirko Titze & Matthias Piontek, 2018. "Knowledge Interactions in Regional Innovation Networks: Comparing Data Sources," Jena Economics Research Papers 2018-003, Friedrich-Schiller-University Jena.
