Printed from https://ideas.repec.org/a/bot/rivsta/v75y2015i1p19-35.html

Regression analysis with linked data: problems and possible solutions

Authors

  • Andrea Tancredi (Università di Roma "La Sapienza", Italy)
  • Brunero Liseo (Università di Roma "La Sapienza", Italy)

Abstract

This paper describes and extends some recent proposals for a general Bayesian methodology for performing record linkage and making inference from the resulting matched units. In particular, we frame the record linkage process within a formal statistical model that comprises both the matching variables and the other variables used at the inferential stage. This allows the researcher to account for matching uncertainty in inferential procedures based on probabilistically linked data and, at the same time, to propagate information in both directions between the working statistical model and the record linkage stage. We argue that this feedback effect is essential to eliminate potential biases that would otherwise characterize inference from linked data, and that it can also improve record linkage performance. The practical implementation of the procedure relies on standard Bayesian computational techniques, such as Markov chain Monte Carlo algorithms. Although the methodology is quite general, for expository convenience we restrict our analysis to the popular and important case of the multiple linear regression set-up.
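
The bias the abstract refers to can be illustrated with a small simulation. The sketch below is not the authors' Bayesian model: it only shows, under assumed settings, how a naive least-squares fit on imperfectly linked records tends to shrink the regression slope. The function name, sample size, true slope and mismatch rate are all hypothetical choices made for illustration, and wrong links are modelled simply by shuffling the response among a random subset of records.

```python
import numpy as np


def simulate_linked_regression(n=5000, beta=2.0, mismatch_rate=0.2, seed=0):
    """Compare OLS slopes on correctly linked vs. partly mislinked data.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # File A holds the covariate x, file B holds the response y.
    x = rng.normal(size=n)
    y = 1.0 + beta * x + rng.normal(scale=0.5, size=n)

    # Assumed linkage-error mechanism: a random subset of records gets
    # a y value belonging to another record in that same subset.
    wrong = rng.random(n) < mismatch_rate
    idx = np.flatnonzero(wrong)
    y_linked = y.copy()
    y_linked[idx] = y[rng.permutation(idx)]

    # np.polyfit with degree 1 returns [slope, intercept].
    slope_true = np.polyfit(x, y, 1)[0]
    slope_linked = np.polyfit(x, y_linked, 1)[0]
    return slope_true, slope_linked


if __name__ == "__main__":
    s_true, s_linked = simulate_linked_regression()
    print(f"slope with perfect links:    {s_true:.3f}")
    print(f"slope with 20% wrong links:  {s_linked:.3f}")
```

In this toy setting the slope estimated from the mislinked file is pulled toward zero, roughly in proportion to the share of wrong links. A model that treats the linkage and the regression jointly, as the abstract proposes, is one way to account for this.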

Suggested Citation

  • Andrea Tancredi & Brunero Liseo, 2015. "Regression analysis with linked data: problems and possible solutions," Statistica, Department of Statistics, University of Bologna, vol. 75(1), pages 19-35.
  • Handle: RePEc:bot:rivsta:v:75:y:2015:i:1:p:19-35

    Citations

    Cited by:

    1. John M. Abowd & Joelle Abramowitz & Margaret C. Levenstein & Kristin McCue & Dhiren Patki & Trivellore Raghunathan & Ann M. Rodgers & Matthew D. Shapiro & Nada Wasi, 2019. "Optimal Probabilistic Record Linkage: Best Practice for Linking Employers in Survey and Administrative Data," Working Papers 19-08, Center for Economic Studies, U.S. Census Bureau.
    2. John Cuffe & Nathan Goldschlag, 2018. "Squeezing More Out of Your Data: Business Record Linkage with Python," Working Papers 18-46, Center for Economic Studies, U.S. Census Bureau.
