Printed from https://ideas.repec.org/p/cen/wpaper/12-13.html

Dynamically Consistent Noise Infusion and Partially Synthetic Data as Confidentiality Protection Measures for Related Time Series

Author

Listed:
  • John M. Abowd
  • Kaj Gittings
  • Kevin L. McKinney
  • Bryce E. Stephens
  • Lars Vilhuber
  • Simon Woodcock

Abstract

The Census Bureau's Quarterly Workforce Indicators (QWI) provide detailed quarterly statistics on employment measures such as worker and job flows, tabulated by worker characteristics in various combinations. The data are released for several levels of NAICS industries and geography, the lowest aggregation of the latter being counties. Disclosure avoidance methods are required to protect the information about individuals and businesses that contribute to the underlying data. The QWI disclosure avoidance mechanism we describe here relies heavily on the use of noise infusion through a permanent multiplicative noise distortion factor, used for magnitudes, counts, differences and ratios. There is minimal suppression and no complementary suppressions. To our knowledge, the release in 2003 of the QWI was the first large-scale use of noise infusion in any official statistical product. We show that the released statistics are analytically valid along several critical dimensions: measures are unbiased and time series properties are preserved. We provide an analysis of the degree to which confidentiality is protected. Furthermore, we show how the judicious use of synthetic data, injected into the tabulation process, can completely eliminate suppressions, maintain analytical validity, and increase the protection of the underlying confidential data.
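
The idea of a permanent multiplicative noise factor can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not state the noise distribution or its parameters, so the sketch assumes a factor drawn uniformly from two intervals placed symmetrically around 1 and bounded away from 1, with hypothetical bounds `a` and `b`. The establishment names and employment figures are made up for illustration. Each establishment keeps the same factor in every quarter, so its own growth rates are unchanged, while the symmetric distribution means distortions tend to cancel in cell totals.

```python
import random


def draw_permanent_factor(rng, a=0.05, b=0.15):
    """Draw one factor from [1-b, 1-a] or [1+a, 1+b], each side equally likely.

    The bounds a and b are hypothetical; the distribution is an assumption.
    """
    magnitude = rng.uniform(a, b)
    sign = 1 if rng.random() < 0.5 else -1
    return 1.0 + sign * magnitude


def infuse(employment, factors):
    """Multiply every quarter of each establishment's series by its fixed factor."""
    return {
        est: [factors[est] * value for value in series]
        for est, series in employment.items()
    }


def cell_totals(employment):
    """Sum the establishment series quarter by quarter."""
    return [sum(quarter) for quarter in zip(*employment.values())]


rng = random.Random(12345)

# Hypothetical establishments, four quarters of employment each.
true_employment = {
    f"est{i}": [100 + i, 102 + i, 101 + i, 105 + i] for i in range(200)
}

# One factor per establishment, drawn once and reused in every quarter.
factors = {est: draw_permanent_factor(rng) for est in true_employment}

noisy_employment = infuse(true_employment, factors)
true_totals = cell_totals(true_employment)
noisy_totals = cell_totals(noisy_employment)

for quarter, (true, noisy) in enumerate(zip(true_totals, noisy_totals), start=1):
    print(f"Q{quarter}: true={true}, noisy={noisy:.1f}, ratio={noisy / true:.4f}")
```

Under these assumptions, a cell containing a single establishment is always distorted by at least `a`, while a cell that aggregates many establishments is typically distorted by much less.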

Suggested Citation

  • John M. Abowd & Kaj Gittings & Kevin L. McKinney & Bryce E. Stephens & Lars Vilhuber & Simon Woodcock, 2012. "Dynamically Consistent Noise Infusion and Partially Synthetic Data as Confidentiality Protection Measures for Related Time Series," Working Papers 12-13, Center for Economic Studies, U.S. Census Bureau.
  • Handle: RePEc:cen:wpaper:12-13

    Download full text from publisher

    File URL: https://www2.census.gov/ces/wp/2012/CES-WP-12-13.pdf
    File Function: First version, 2012
    Download Restriction: no

    References listed on IDEAS

    1. John M. Abowd & John Haltiwanger & Julia Lane, 2004. "Integrated Longitudinal Employer-Employee Data for the United States," American Economic Review, American Economic Association, vol. 94(2), pages 224-229, May.
    2. John M. Abowd & Bryce E. Stephens & Lars Vilhuber & Fredrik Andersson & Kevin L. McKinney & Marc Roemer & Simon Woodcock, 2009. "The LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators," NBER Chapters, in: Producer Dynamics: New Evidence from Micro Data, pages 149-230, National Bureau of Economic Research, Inc.
    3. John M. Abowd & Julia I. Lane, 2004. "New Approaches to Confidentiality Protection Synthetic Data, Remote Access and Research Data Centers," Longitudinal Employer-Household Dynamics Technical Papers 2004-03, Center for Economic Studies, U.S. Census Bureau.
    4. Timothy Dunne & J. Bradford Jensen & Mark J. Roberts, 2009. "Producer Dynamics: New Evidence from Micro Data," NBER Books, National Bureau of Economic Research, Inc, number dunn05-1, March.

    Citations



    Cited by:

    1. Kevin L. McKinney & Andrew S. Green & Lars Vilhuber & John M. Abowd, 2020. "Total Error and Variability Measures for the Quarterly Workforce Indicators and LEHD Origin Destination Employment Statistics in OnTheMap," Working Papers 20-30, Center for Economic Studies, U.S. Census Bureau.
    2. Javier Miranda & Lars Vilhuber, 2014. "Looking Back On Three Years Of Using The Synthetic LBD Beta," Working Papers 14-11, Center for Economic Studies, U.S. Census Bureau.
    3. Kevin L. McKinney & Andrew S. Green & Lars Vilhuber & John M. Abowd, 2017. "Total Error and Variability Measures with Integrated Disclosure Limitation for Quarterly Workforce Indicators and LEHD Origin Destination Employment Statistics in On The Map," Working Papers 17-71, Center for Economic Studies, U.S. Census Bureau.
    4. John M. Abowd & Kevin L. McKinney, 2014. "Noise Infusion As A Confidentiality Protection Measure For Graph-Based Statistics," Working Papers 14-30, Center for Economic Studies, U.S. Census Bureau.
    5. Ian Schmutte & Lars Vilhuber, 2022. "An Interview with John M. Abowd," International Statistical Review, International Statistical Institute, vol. 90(1), pages 1-40, April.
    6. John M. Abowd & Ian M. Schmutte & Lars Vilhuber, 2018. "Disclosure Limitation and Confidentiality Protection in Linked Data," Working Papers 18-07, Center for Economic Studies, U.S. Census Bureau.
    7. Javier Miranda & Lars Vilhuber, 2016. "Using Partially Synthetic Microdata to Protect Sensitive Cells in Business Statistics," Working Papers 16-10, Center for Economic Studies, U.S. Census Bureau.
    8. Thiemo Fetzer, 2014. "Fracking Growth," CEP Discussion Papers dp1278, Centre for Economic Performance, LSE.
    9. Piyush Anand & Clarence Lee, 2023. "Using Deep Learning to Overcome Privacy and Scalability Issues in Customer Data Transfer," Marketing Science, INFORMS, vol. 42(1), pages 189-207, January.
    10. Robert Manduca, 2018. "The US Census Longitudinal Employer-Household Dynamics Datasets," REGION, European Regional Science Association, vol. 5, pages 5-12.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Barth, Erling & Davis, James C. & Freeman, Richard B. & McElheran, Kristina, 2023. "Twisting the demand curve: Digitalization and the older workforce," Journal of Econometrics, Elsevier, vol. 233(2), pages 443-467.
    2. Fredrik Andersson & John C. Haltiwanger & Mark J. Kutzbach & Henry O. Pollakowski & Daniel H. Weinberg, 2018. "Job Displacement and the Duration of Joblessness: The Role of Spatial Mismatch," The Review of Economics and Statistics, MIT Press, vol. 100(2), pages 203-218, May.
    3. Woodcock, Simon D. & Benedetto, Gary, 2009. "Distribution-preserving statistical disclosure limitation," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 4228-4242, October.
    4. Andersson, Fredrik W. & Burgess, Simon & Lane, Julia, 2009. "Do as the Neighbors Do: The Impact of Social Networks on Immigrant Employment," IZA Discussion Papers 4423, Institute of Labor Economics (IZA).
    5. M. Jahangir Alam & Benoit Dostie & Jörg Drechsler & Lars Vilhuber, 2020. "Applying data synthesis for longitudinal business data across three countries," Statistics in Transition New Series, Polish Statistical Association, vol. 21(4), pages 212-236, August.
    6. Henry Hyatt & Erika McEntarfer, 2012. "Job-to-Job Flows and the Business Cycle," Working Papers 12-04, Center for Economic Studies, U.S. Census Bureau.
    7. Abowd, John M. & Vilhuber, Lars, 2011. "National estimates of gross employment and job flows from the Quarterly Workforce Indicators with demographic and industry detail," Journal of Econometrics, Elsevier, vol. 161(1), pages 82-99, March.
    8. John M. Abowd & Ian M. Schmutte & Lars Vilhuber, 2018. "Disclosure Limitation and Confidentiality Protection in Linked Data," Working Papers 18-07, Center for Economic Studies, U.S. Census Bureau.
    9. Kevin L. McKinney & John M. Abowd & John Sabelhaus, 2021. "United States Earnings Dynamics: Inequality, Mobility, and Volatility," NBER Chapters, in: Measuring Distribution and Mobility of Income and Wealth, pages 69-104, National Bureau of Economic Research, Inc.
    10. John R. Graham & Hyunseob Kim & Si Li & Jiaping Qiu, 2019. "Employee Costs of Corporate Bankruptcy," NBER Working Papers 25922, National Bureau of Economic Research, Inc.
    11. Joyce K. Hahn & Henry R. Hyatt & Hubert P. Janicki & Stephen R. Tibbets, 2017. "Job-to-Job Flows and Earnings Growth," American Economic Review, American Economic Association, vol. 107(5), pages 358-363, May.
    12. Matthew R. Graham & Mark J. Kutzbach & Danielle H. Sandler, 2017. "Developing a Residence Candidate File for Use With Employer-Employee Matched Data," Working Papers 17-40, Center for Economic Studies, U.S. Census Bureau.
    13. Christopher Goetz & Henry Hyatt & Erika McEntarfer & Kristin Sandusky, 2016. "The Promise and Potential of Linked Employer-Employee Data for Entrepreneurship Research," NBER Chapters, in: Measuring Entrepreneurial Businesses: Current Knowledge and Challenges, pages 433-462, National Bureau of Economic Research, Inc.
    14. Nicholas Bloom & Scott Ohlmacher & Cristina Tello-Trillo & Melanie Wallskog, 2021. "Pay, Productivity and Management," Working Papers 21-31, Center for Economic Studies, U.S. Census Bureau.
    15. Shigeru Fujita & Giuseppe Moscarini, 2017. "Recall and Unemployment," American Economic Review, American Economic Association, vol. 107(12), pages 3875-3916, December.
    16. David Blau & Tetyana Shvydko, 2011. "Labor Market Rigidities and the Employment Behavior of Older Workers," ILR Review, Cornell University, ILR School, vol. 64(3), pages 464-484, April.
    17. John Haltiwanger & Henry Hyatt & Erika McEntarfer, 2015. "Cyclical Reallocation of Workers Across Employers by Firm Size and Firm Wage," NBER Working Papers 21235, National Bureau of Economic Research, Inc.
    18. Ian M. Schmutte, 2015. "Job Referral Networks and the Determination of Earnings in Local Labor Markets," Journal of Labor Economics, University of Chicago Press, vol. 33(1), pages 1-32.
    19. E. Mark Curtis & Barry T. Hirsch & Mary C. Schroeder, 2016. "Evaluating Workplace Mandates with Flows Versus Stocks: An Application to California Paid Family Leave," Southern Economic Journal, John Wiley & Sons, vol. 83(2), pages 501-526, October.
    20. Hellerstein, Judith K. & Kutzbach, Mark J. & Neumark, David, 2014. "Do labor market networks have an important spatial dimension?," Journal of Urban Economics, Elsevier, vol. 79(C), pages 39-58.

    More about this item

    Keywords

    noise infusion; synthetic data; statistical disclosure limitation; time-series; local labor markets; gross job flows; gross worker flows; confidentiality protection;

    JEL classification:

    • C82 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Macroeconomic Data; Data Access
    • J21 - Labor and Demographic Economics - - Demand and Supply of Labor - - - Labor Force and Employment, Size, and Structure
    • J23 - Labor and Demographic Economics - - Demand and Supply of Labor - - - Labor Demand
    • J40 - Labor and Demographic Economics - - Particular Labor Markets - - - General

