
A Practical Method to Reduce Privacy Loss when Disclosing Statistics Based on Small Samples

Author

Listed:
  • Raj Chetty
  • John N. Friedman

Abstract

We develop a simple method to reduce privacy loss when disclosing statistics such as OLS regression estimates based on samples with small numbers of observations. We focus on the case where the dataset can be broken into many groups (“cells”) and one is interested in releasing statistics for one or more of these cells. Building on ideas from the differential privacy literature, we add noise to the statistic of interest in proportion to the statistic's maximum observed sensitivity, defined as the maximum change in the statistic from adding or removing a single observation across all the cells in the data. Intuitively, our approach permits the release of statistics in arbitrarily small samples by adding sufficient noise to the estimates to protect privacy. Although our method does not offer a formal privacy guarantee, it generally outperforms widely used methods of disclosure limitation such as count-based cell suppression both in terms of privacy loss and statistical bias. We illustrate how the method can be implemented by discussing how it was used to release estimates of social mobility by Census tract in the Opportunity Atlas. We also provide a step-by-step guide and illustrative Stata code to implement our approach.
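The noise-infusion idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Stata implementation. It assumes the statistic is a cell mean, measures sensitivity only by removing (not adding) one observation, and draws Laplace noise scaled by a hypothetical privacy parameter `epsilon`. The abstract does not state the noise distribution or the exact scaling, so those choices, and all function names below, are illustrative assumptions.

```python
import random
from statistics import mean


def leave_one_out_sensitivity(values):
    """Largest absolute change in the cell mean from dropping one observation."""
    n = len(values)
    if n < 2:
        # Assumption: the mean is undefined after removing the only
        # observation, so such cells contribute no sensitivity here.
        return 0.0
    full = mean(values)
    total = sum(values)
    return max(abs(full - (total - v) / (n - 1)) for v in values)


def max_observed_sensitivity(cells):
    """Maximum of the per-cell sensitivities across all cells in the data."""
    return max(leave_one_out_sensitivity(v) for v in cells.values())


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_cell_means(cells, epsilon, rng):
    """Release each cell mean plus noise proportional to the max sensitivity.

    Assumption: noise scale = max observed sensitivity / epsilon.
    """
    scale = max_observed_sensitivity(cells) / epsilon
    return {key: mean(vals) + laplace(scale, rng) for key, vals in cells.items()}


if __name__ == "__main__":
    # Hypothetical data: two small cells.
    example = {"tract_a": [1.0, 2.0, 3.0], "tract_b": [10.0, 10.0, 40.0]}
    print(noisy_cell_means(example, epsilon=1.0, rng=random.Random(0)))
```

Because the noise scale is driven by the most sensitive cell in the data, every cell receives noise of the same magnitude in this sketch; a smaller `epsilon` means more noise and less privacy loss.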

Suggested Citation

  • Raj Chetty & John N. Friedman, 2019. "A Practical Method to Reduce Privacy Loss when Disclosing Statistics Based on Small Samples," NBER Working Papers 25626, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:25626
    Note: CH LS PE

    Download full text from publisher

    File URL: http://www.nber.org/papers/w25626.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Joshua D. Angrist & Parag A. Pathak & Christopher R. Walters, 2013. "Explaining Charter School Effectiveness," American Economic Journal: Applied Economics, American Economic Association, vol. 5(4), pages 1-27, October.
    2. John M. Abowd & Ian M. Schmutte, 2019. "An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices," American Economic Review, American Economic Association, vol. 109(1), pages 171-202, January.
    3. John M. Abowd & Ian M. Schmutte, 2015. "Economic Analysis and Statistical Disclosure Limitation," Brookings Papers on Economic Activity, Economic Studies Program, The Brookings Institution, vol. 46(1), pages 221-293, Spring.
    4. J. Trent Alexander & Michael Davern & Betsey Stevenson, 2010. "Inaccurate age and sex data in the Census PUMS files: Evidence and Implications," NBER Working Papers 15703, National Bureau of Economic Research, Inc.
    5. Wasserman, Larry & Zhou, Shuheng, 2010. "A Statistical Framework for Differential Privacy," Journal of the American Statistical Association, American Statistical Association, vol. 105(489), pages 375-389.
    6. Raj Chetty & John N. Friedman & Nathaniel Hendren & Maggie R. Jones & Sonya R. Porter, 2018. "The Opportunity Atlas: Mapping the Childhood Roots of Social Mobility," Working Papers 18-42, Center for Economic Studies, U.S. Census Bureau.

    Citations


    Cited by:

    1. Vilhuber, Lars, 2023. "Reproducibility and transparency versus privacy and confidentiality: Reflections from a data editor," Journal of Econometrics, Elsevier, vol. 235(2), pages 2285-2294.
    2. Ron S. Jarmin & John M. Abowd & Robert Ashmead & Ryan Cumings-Menon & Nathan Goldschlag & Michael B. Hawes & Sallie Ann Keller & Daniel Kifer & Philip Leclerc & Jerome P. Reiter & Rolando A. Rodrígue, 2023. "An in-depth examination of requirements for disclosure risk assessment," Proceedings of the National Academy of Sciences, Proceedings of the National Academy of Sciences, vol. 120(43), pages 2220558120-, October.
    3. Michler, Jeffrey D. & Josephson, Anna & Kilic, Talip & Murray, Siobhan, 2022. "Privacy protection, measurement error, and the integration of remote sensing and socioeconomic survey data," Journal of Development Economics, Elsevier, vol. 158(C).
    4. Dionissi Aliprantis & Hal Martin, 2020. "Neighborhood Sorting Obscures Neighborhood Effects in the Opportunity Atlas," Working Papers 20-37, Federal Reserve Bank of Cleveland.
    5. Atheendar S Venkataramani & Rourke O’Brien & Gregory L Whitehorn & Alexander C Tsai, 2020. "Economic influences on population health in the United States: Toward policymaking driven by data and evidence," PLOS Medicine, Public Library of Science, vol. 17(9), pages 1-17, September.
    6. Ian M. Schmutte & Nathan Yoder, 2022. "Information Design for Differential Privacy," Papers 2202.05452, arXiv.org, revised Dec 2022.
    7. Craig Wesley Carpenter & Anders Van Sandt & Scott Loveridge, 2022. "Measurement error in US regional economic data," Journal of Regional Science, Wiley Blackwell, vol. 62(1), pages 57-80, January.
    8. Braathen, Christian & Thorsen, Inge & Ubøe, Jan, 2022. "Adjusting for Cell Suppression in Commuting Trip Data," Discussion Papers 2022/13, Norwegian School of Economics, Department of Business and Management Science.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. John M. Abowd & Robert Ashmead & Ryan Cumings-Menon & Simson Garfinkel & Micah Heineck & Christine Heiss & Robert Johns & Daniel Kifer & Philip Leclerc & Ashwin Machanavajjhala & Brett Moran & William, 2022. "The 2020 Census Disclosure Avoidance System TopDown Algorithm," Papers 2204.08986, arXiv.org.
    2. Braathen, Christian & Thorsen, Inge & Ubøe, Jan, 2022. "Adjusting for Cell Suppression in Commuting Trip Data," Discussion Papers 2022/13, Norwegian School of Economics, Department of Business and Management Science.
    3. Michler, Jeffrey D. & Josephson, Anna & Kilic, Talip & Murray, Siobhan, 2022. "Privacy protection, measurement error, and the integration of remote sensing and socioeconomic survey data," Journal of Development Economics, Elsevier, vol. 158(C).
    4. Vilhuber, Lars, 2023. "Reproducibility and transparency versus privacy and confidentiality: Reflections from a data editor," Journal of Econometrics, Elsevier, vol. 235(2), pages 2285-2294.
    5. Craig Wesley Carpenter & Anders Van Sandt & Scott Loveridge, 2022. "Measurement error in US regional economic data," Journal of Regional Science, Wiley Blackwell, vol. 62(1), pages 57-80, January.
    6. John M. Abowd & Ian M. Schmutte & William Sexton & Lars Vilhuber, 2019. "Suboptimal Provision of Privacy and Statistical Accuracy When They are Public Goods," Papers 1906.09353, arXiv.org.
    7. Raj Chetty & John N. Friedman & Nathaniel Hendren & Maggie R. Jones & Sonya R. Porter, 2018. "The Opportunity Atlas: Mapping the Childhood Roots of Social Mobility," Working Papers 18-42, Center for Economic Studies, U.S. Census Bureau.
    8. Ron S. Jarmin & John M. Abowd & Robert Ashmead & Ryan Cumings-Menon & Nathan Goldschlag & Michael B. Hawes & Sallie Ann Keller & Daniel Kifer & Philip Leclerc & Jerome P. Reiter & Rolando A. Rodrígue, 2023. "An in-depth examination of requirements for disclosure risk assessment," Proceedings of the National Academy of Sciences, Proceedings of the National Academy of Sciences, vol. 120(43), pages 2220558120-, October.
    9. Robertas Damasevicius, 2023. "Progress, Evolving Paradigms and Recent Trends in Economic Analysis," Financial Economics Letters, Anser Press, vol. 2(2), pages 35-47, October.
    10. John M Abowd & Michael B Hawes, 2022. "Confidentiality Protection in the 2020 US Census of Population and Housing," Papers 2206.03524, arXiv.org, revised Dec 2022.
    11. Jahangir Alam M. & Dostie Benoit & Drechsler Jörg & Vilhuber Lars, 2020. "Applying data synthesis for longitudinal business data across three countries," Statistics in Transition New Series, Polish Statistical Association, vol. 21(4), pages 212-236, August.
    12. Ryan Cumings-Menon, 2022. "Differentially Private Estimation via Statistical Depth," Papers 2207.12602, arXiv.org.
    13. John M. Abowd & Ian M. Schmutte & William N. Sexton & Lars Vilhuber, 2019. "Why the Economics Profession Must Actively Participate in the Privacy Protection Debate," AEA Papers and Proceedings, American Economic Association, vol. 109, pages 397-402, May.
    14. John M. Abowd & Ian M. Schmutte, 2017. "Revisiting the Economics of Privacy: Population Statistics and Confidentiality Protection as Public Goods," Working Papers 17-37, Center for Economic Studies, U.S. Census Bureau.
    15. Edoardo Ciscato & Alfred Galichon & Marion Goussé, 2020. "Like Attract Like? A Structural Comparison of Homogamy across Same-Sex and Different-Sex Households," Journal of Political Economy, University of Chicago Press, vol. 128(2), pages 740-781.
    16. Susan Dynarski & Daniel Hubbard & Brian Jacob & Silvia Robles, 2018. "Estimating the Effects of a Large For-Profit Charter School Operator," NBER Working Papers 24428, National Bureau of Economic Research, Inc.
    17. Song, Yang, 2019. "Sorting, school performance and quality: Evidence from China," Journal of Comparative Economics, Elsevier, vol. 47(1), pages 238-261.
    18. Peter Leopold S. Bergman & Eric W. Chan & Adam Kapor, 2020. "Housing Search Frictions: Evidence from Detailed Search Data and a Field Experiment," CESifo Working Paper Series 8080, CESifo.
    19. Shaun M. Dougherty, 2018. "The Effect of Career and Technical Education on Human Capital Accumulation: Causal Evidence from Massachusetts," Education Finance and Policy, MIT Press, vol. 13(2), pages 119-148, Spring.
    20. John Gathergood & Fabian Gunzinger & Benedict Guttman-Kenney & Edika Quispe-Torreblanca & Neil Stewart, 2020. "Levelling Down and the COVID-19 Lockdowns: Uneven Regional Recovery in UK Consumer Spending," Papers 2012.09336, arXiv.org, revised Dec 2020.

    More about this item

    JEL classification:

    • C0 - Mathematical and Quantitative Methods - - General
    • H0 - Public Economics - - General
