Towards Unrestricted Public Use Business Microdata: The Synthetic Longitudinal Business Database
In most countries, national statistical agencies do not release establishment-level business microdata, because doing so represents too large a risk to establishments' confidentiality. One approach with the potential for overcoming these risks is to release synthetic data; that is, the released establishment data are simulated from statistical models designed to mimic the distributions of the underlying real microdata. In this article, we describe an application of this strategy to create a public use file for the Longitudinal Business Database, an annual economic census of establishments in the United States comprising more than 20 million records dating back to 1976. The U.S. Bureau of the Census and the Internal Revenue Service recently approved the release of these synthetic microdata for public use, making the synthetic Longitudinal Business Database the first-ever business microdata set publicly released in the United States. We describe how we created the synthetic data, evaluated analytical validity, and assessed disclosure risk.
|Date of creation:||Feb 2011|
|Contact details of provider:||Phone: (301) 763-6460; Fax: (301) 763-5935; Web page: http://www.census.gov/ces|
|Handle:||RePEc:cen:wpaper:11-04|