Printed from https://ideas.repec.org/p/nbr/nberwo/31995.html

A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census: Full Technical Report

Authors

  • John M. Abowd
  • Tamara Adams
  • Robert Ashmead
  • David Darais
  • Sourya Dey
  • Simson L. Garfinkel
  • Nathan Goldschlag
  • Daniel Kifer
  • Philip Leclerc
  • Ethan Lew
  • Scott Moore
  • Rolando A. Rodríguez
  • Ramy N. Tadros
  • Lars Vilhuber

Abstract

For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level—individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy. 
Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics’ utility for the primary statutory use case: redrawing the boundaries of all of the nation’s legislative and voting districts in compliance with the 1965 Voting Rights Act. You are reading the full technical report. For the summary paper see https://doi.org/10.1162/99608f92.4a1ebf70.

Suggested Citation

  • John M. Abowd & Tamara Adams & Robert Ashmead & David Darais & Sourya Dey & Simson L. Garfinkel & Nathan Goldschlag & Daniel Kifer & Philip Leclerc & Ethan Lew & Scott Moore & Rolando A. Rodríguez & Ramy N. Tadros & Lars Vilhuber, 2023. "A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census: Full Technical Report," NBER Working Papers 31995, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:31995
    Note: LS

    Download full text from publisher

    File URL: http://www.nber.org/papers/w31995.pdf
    Download Restriction: no

    More about this item

    JEL classification:

    • C19 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Other
    • J10 - Labor and Demographic Economics - - Demographic Economics - - - General

