Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality
Abstract: "To protect the confidentiality of survey respondents' identities and sensitive attributes, statistical agencies can release data in which confidential values are replaced with multiple imputations. These are called synthetic data. We propose a two-stage approach to generating synthetic data that enables agencies to release different numbers of imputations for different variables. Generation in two stages can reduce computational burdens, decrease disclosure risk, and increase inferential accuracy relative to generation in one stage. We present methods for obtaining inferences from such data. We describe the application of two-stage synthesis to creating a public use file for a German business database." (Author's abstract, IAB-Doku)
Bibliographic Info: Paper provided by Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany] in its series IAB Discussion Paper with number 200720.
Length: 26 pages
Date of creation: 20 Jun 2007
Date of revision:
Publication status: published in: Statistica Sinica, Vol. 20, No. 1 (2010), p. 405-421
Keywords: IAB Establishment Panel (IAB-Betriebspanel); data preparation; data anonymization; data protection; applied statistics; statistical method; labour market research; imputation method
This paper has been announced in the following NEP Reports:
- NEP-ALL-2007-07-13 (All new papers)
References:
- C. J. Skinner & M. J. Elliot, 2002. "A measure of disclosure risk for microdata," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(4), pages 855-867.
- John M. Abowd & Julia I. Lane, 2004. "New Approaches to Confidentiality Protection: Synthetic Data, Remote Access and Research Data Centers," Longitudinal Employer-Household Dynamics Technical Papers 2004-03, Center for Economic Studies, U.S. Census Bureau.
- Karr, A.F. & Kohnen, C.N. & Oganian, A. & Reiter, J.P. & Sanil, A.P., 2006. "A Framework for Evaluating the Utility of Data Altered to Protect Confidentiality," The American Statistician, American Statistical Association, vol. 60, pages 224-232, August.
- Drechsler, Jörg & Dundler, Agnes & Bender, Stefan & Rässler, Susanne & Zwick, Thomas, 2007. "A new approach for disclosure control in the IAB Establishment Panel: multiple imputation for a better data access," IAB Discussion Paper 200711, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
- Donald B. Rubin, 2003. "Nested multiple imputation of NMES via partially incompatible MCMC," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 57(1), pages 3-18.
- Jörg Höhne, 2008. "Anonymisierungsverfahren für Paneldaten," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer, vol. 2(3), pages 259-275, October.