Printed from https://ideas.repec.org/p/cen/wpaper/11-04.html

Towards Unrestricted Public Use Business Microdata: The Synthetic Longitudinal Business Database

Author

Listed:
  • Satkartar K. Kinney
  • Jerome P. Reiter
  • Arnold P. Reznek
  • Javier Miranda
  • Ron S. Jarmin
  • John M. Abowd

Abstract

In most countries, national statistical agencies do not release establishment-level business microdata, because doing so represents too large a risk to establishments' confidentiality. One approach with the potential for overcoming these risks is to release synthetic data; that is, the released establishment data are simulated from statistical models designed to mimic the distributions of the underlying real microdata. In this article, we describe an application of this strategy to create a public use file for the Longitudinal Business Database, an annual economic census of establishments in the United States comprising more than 20 million records dating back to 1976. The U.S. Bureau of the Census and the Internal Revenue Service recently approved the release of these synthetic microdata for public use, making the synthetic Longitudinal Business Database the first-ever business microdata set publicly released in the United States. We describe how we created the synthetic data, evaluated analytical validity, and assessed disclosure risk.
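The general idea of simulating released values from a fitted model can be illustrated with a toy sketch. This is not the method used for the synthetic Longitudinal Business Database, which the abstract does not specify; it assumes a single positive variable, a log-normal model chosen purely for illustration, and made-up input values. The function name `synthesize_lognormal` and the sample data are hypothetical.

```python
import math
import random
import statistics


def synthesize_lognormal(confidential, n, seed=0):
    """Fit a log-normal model to positive values and draw n synthetic values.

    Illustrative assumption only: a real synthesizer would model many
    variables jointly and over time, not one variable in isolation.
    """
    logs = [math.log(x) for x in confidential]
    mu = statistics.mean(logs)       # fitted location on the log scale
    sigma = statistics.stdev(logs)   # fitted spread on the log scale
    rng = random.Random(seed)        # seeded for reproducibility
    # Each released value is a fresh draw from the fitted model,
    # not a copy or perturbation of any single confidential record.
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]


# Hypothetical "confidential" employment counts, invented for this sketch.
confidential = [3, 5, 8, 12, 20, 35, 50, 120, 400, 1500]
synthetic = synthesize_lognormal(confidential, n=1000)

# A crude analytical-validity check: compare a summary statistic
# computed on the confidential values and on the synthetic values.
real_log_mean = statistics.mean(math.log(x) for x in confidential)
synth_log_mean = statistics.mean(math.log(x) for x in synthetic)
```

Comparing `real_log_mean` with `synth_log_mean` mirrors, in miniature, the kind of analytical validity evaluation the abstract mentions; assessing disclosure risk would need separate analysis that this sketch does not attempt.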

Suggested Citation

  • Satkartar K. Kinney & Jerome P. Reiter & Arnold P. Reznek & Javier Miranda & Ron S. Jarmin & John M. Abowd, 2011. "Towards Unrestricted Public Use Business Microdata: The Synthetic Longitudinal Business Database," Working Papers 11-04, Center for Economic Studies, U.S. Census Bureau.
  • Handle: RePEc:cen:wpaper:11-04

    Download full text from publisher

    File URL: https://www2.census.gov/ces/wp/2011/CES-WP-11-04.pdf
    File Function: First version, 2011
    Download Restriction: no

    Citations

    Cited by:

    1. Joseph W. Sakshaug & Trivellore E. Raghunathan, 2014. "Generating synthetic microdata to estimate small area statistics in the American Community Survey," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 15(3), pages 341-368, June.
    2. Tatiana V. Komarova & Denis Nekipelov & Evgeny Yakovlev, 2011. "Identification, data combination and the risk of disclosure," CeMMAP working papers CWP38/11, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    3. Ori Heffetz & Katrina Ligett, 2014. "Privacy and Data-Based Research," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 75-98, Spring.
    4. John M. Abowd & Ian M. Schmutte, 2015. "Economic Analysis and Statistical Disclosure Limitation," Brookings Papers on Economic Activity, Economic Studies Program, The Brookings Institution, vol. 46(1 (Spring)), pages 221-293.
    5. Roderick J. Little, 2013. "Discussion," Journal of Official Statistics, De Gruyter Open, vol. 29(3), pages 363-366, June.
    6. Allen Tran, 2013. "Customer Driven Establishment Dynamics and Allocative Efficiency," 2013 Meeting Papers 115, Society for Economic Dynamics.
    7. Satkartar K. Kinney & Jerome P. Reiter & Javier Miranda, 2014. "Improving The Synthetic Longitudinal Business Database," Working Papers 14-12, Center for Economic Studies, U.S. Census Bureau.
    8. Martin Klein & Bimal Sinha, 2015. "Likelihood-based inference for singly and multiply imputed synthetic data under a normal model," Statistics & Probability Letters, Elsevier, vol. 105(C), pages 168-175.
    9. John M. Abowd & Ian M. Schmutte & Lars Vilhuber, 2018. "Disclosure Limitation and Confidentiality Protection in Linked Data," Working Papers 18-07, Center for Economic Studies, U.S. Census Bureau.
    10. Javier Miranda & Lars Vilhuber, 2016. "Using Partially Synthetic Microdata to Protect Sensitive Cells in Business Statistics," Working Papers 16-10, Center for Economic Studies, U.S. Census Bureau.
