Linking household survey and administrative record data: what should the matching variables be?
Linkages of household survey responses with administrative data may be based on unique individual identifiers or on survey respondent characteristics. The benefits gained from using unique identifiers need to be assessed in the light of potential problems such as non-response and measurement error. We report on a study that linked survey responses to UK government agency records on benefits and tax credits in five different ways. One matched on a respondent-supplied National Insurance Number, and the other four used different combinations of sex, name, address, and date of birth. Matching on sex, date of birth, and postcode, or on sex, date of birth, first name, and family name, produced as many linkages as matching on self-reported National Insurance Number, and these linkages were also relatively accurate when assessed in terms of false positive and false negative rates. The five independent matching exercises also shed light on the potential returns from hierarchical and pooled matching.
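To make the matching strategies in the abstract concrete, here is a minimal Python sketch. It is not the authors' procedure: the record layout, the field names (`survey_id`, `admin_id`, `nino`, `sex`, `dob`, `postcode`), the rule that only unique exact matches count as links, and the definitions of the two error rates are all assumptions made for illustration. The toy records are invented and say nothing about the study's results.

```python
from collections import defaultdict


def link(survey, admin, keys):
    """Link each survey record to the single admin record agreeing on all keys."""
    index = defaultdict(list)
    for rec in admin:
        if all(rec.get(k) for k in keys):
            index[tuple(rec[k] for k in keys)].append(rec["admin_id"])
    links = {}
    for rec in survey:
        if not all(rec.get(k) for k in keys):
            continue  # item non-response: a key is missing, so no match is attempted
        candidates = index.get(tuple(rec[k] for k in keys), [])
        if len(candidates) == 1:  # ambiguous matches are left unlinked
            links[rec["survey_id"]] = candidates[0]
    return links


def hierarchical(survey, admin, key_sets):
    """Try key sets in priority order; keep the first link found per respondent."""
    links = {}
    for keys in key_sets:
        for sid, aid in link(survey, admin, keys).items():
            links.setdefault(sid, aid)
    return links


def error_rates(links, truth):
    """Assumed definitions: false positive rate = share of links made that are
    wrong; false negative rate = share of true matches not recovered."""
    wrong = sum(1 for sid, aid in links.items() if truth.get(sid) != aid)
    missed = sum(1 for sid, aid in truth.items() if links.get(sid) != aid)
    fp = wrong / len(links) if links else 0.0
    fn = missed / len(truth) if truth else 0.0
    return fp, fn


# Invented toy data: respondent 2 withholds the NINO, respondent 3 misreports it.
survey = [
    {"survey_id": 1, "nino": "AB123456C", "sex": "F", "dob": "1960-05-01", "postcode": "CO4 3SQ"},
    {"survey_id": 2, "nino": None, "sex": "M", "dob": "1972-11-23", "postcode": "CO1 1AA"},
    {"survey_id": 3, "nino": "ZZ999999Z", "sex": "F", "dob": "1985-02-14", "postcode": "CO2 7BB"},
]
admin = [
    {"admin_id": "a1", "nino": "AB123456C", "sex": "F", "dob": "1960-05-01", "postcode": "CO4 3SQ"},
    {"admin_id": "a2", "nino": "CD654321A", "sex": "M", "dob": "1972-11-23", "postcode": "CO1 1AA"},
    {"admin_id": "a3", "nino": "EF111111B", "sex": "F", "dob": "1985-02-14", "postcode": "CO2 7BB"},
]
truth = {1: "a1", 2: "a2", 3: "a3"}  # assumed known only for this illustration

by_nino = link(survey, admin, ["nino"])
by_traits = link(survey, admin, ["sex", "dob", "postcode"])
hier = hierarchical(survey, admin, [["nino"], ["sex", "dob", "postcode"]])
```

In this sketch, non-response and misreporting of the identifier reduce the number of links made on `nino`, while the hierarchical pass falls back to respondent characteristics for records the identifier could not link. A pooled variant could instead accept a link found by any key set, at the cost of having to resolve disagreements between key sets.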
Date of creation: 01 Oct 2004
Contact details of provider: Web page: https://www.iser.essex.ac.uk/
Order information: Postal: Publications Office, Institute for Social and Economic Research, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ UK; Web: https://www.iser.essex.ac.uk/publications/
Handle: RePEc:ese:iserwp:2004-23