
Differentially private histogram with valid statistics

Author

Listed:
  • Cao, Zilong
  • Wu, Shisong
  • Li, Xuanang
  • Zhang, Hai

Abstract

Differentially private histograms (DP-Histograms) are integral to data publication and privacy preservation efforts. However, conventional DP-Histograms often fail to preserve valid statistical information and the essential characteristics of the original data. This paper shows that invalid variance is an inherent shortcoming of general DP-Histograms and introduces a novel algorithm, the Differentially Private Histogram with Valid Statistics (VSDPH), to overcome the problem. VSDPH, grounded in linear programming and the bounded Lipschitz distance, efficiently generates DP histograms while preserving the valid statistics of the original data. Our theoretical analysis demonstrates that histograms produced by VSDPH maintain asymptotically valid variance, and we establish an upper bound based on the 1-Wasserstein distance. Through experiments, we validate that VSDPH accurately preserves the statistical characteristics of the original data, which brings the resulting histograms closer to the originals.
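
To make the quantities named in the abstract concrete, the following is a minimal sketch of a conventional Laplace-mechanism histogram together with the two diagnostics the abstract mentions: the variance implied by a histogram and the 1-Wasserstein distance between two histograms. It is not the authors' VSDPH algorithm, which this record does not describe in detail. The function names, the bin layout, the value of epsilon, and the add/remove-one neighbouring relation (L1 sensitivity 1) are all assumptions made for illustration.

```python
import numpy as np


def laplace_histogram(counts, epsilon, rng):
    """Baseline DP histogram (assumed setup, not VSDPH).

    Adds Laplace(1/epsilon) noise to each bin count, assuming neighbouring
    datasets differ by adding or removing one record (L1 sensitivity 1),
    then clips negatives and normalises as post-processing.
    """
    noisy = counts + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    noisy = np.clip(noisy, 0.0, None)
    total = noisy.sum()
    if total == 0.0:
        # Degenerate case: fall back to a uniform histogram.
        return np.full(counts.shape, 1.0 / counts.size)
    return noisy / total


def hist_mean_var(probs, centers):
    """Mean and variance of the discrete distribution placed on bin centers."""
    mean = np.sum(probs * centers)
    var = np.sum(probs * (centers - mean) ** 2)
    return mean, var


def wasserstein1(p, q, centers):
    """1-Wasserstein distance between two histograms on the same sorted centers.

    In one dimension this equals the integral of |CDF_p - CDF_q|.
    """
    cdf_diff = np.cumsum(p - q)[:-1]
    return np.sum(np.abs(cdf_diff) * np.diff(centers))


# Hypothetical data and parameters, chosen only for illustration.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=2000)
counts, edges = np.histogram(data, bins=20, range=(-4.0, 4.0))
counts = counts.astype(float)
centers = 0.5 * (edges[:-1] + edges[1:])

true_probs = counts / counts.sum()
dp_probs = laplace_histogram(counts, epsilon=0.5, rng=rng)

print("original variance:", hist_mean_var(true_probs, centers)[1])
print("DP variance:      ", hist_mean_var(dp_probs, centers)[1])
print("W1 distance:      ", wasserstein1(true_probs, dp_probs, centers))
```

Repeating the last block over many random seeds gives an empirical view of how far the variance of a conventional DP histogram drifts from that of the original, which is the kind of discrepancy the abstract says VSDPH is designed to control.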

Suggested Citation

  • Cao, Zilong & Wu, Shisong & Li, Xuanang & Zhang, Hai, 2025. "Differentially private histogram with valid statistics," Statistics & Probability Letters, Elsevier, vol. 219(C).
  • Handle: RePEc:eee:stapro:v:219:y:2025:i:c:s0167715224003237
    DOI: 10.1016/j.spl.2024.110354

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167715224003237
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. John M. Abowd & Ian M. Schmutte & William Sexton & Lars Vilhuber, 2019. "Suboptimal Provision of Privacy and Statistical Accuracy When They are Public Goods," Papers 1906.09353, arXiv.org.
    2. Ron S. Jarmin & John M. Abowd & Robert Ashmead & Ryan Cumings-Menon & Nathan Goldschlag & Michael B. Hawes & Sallie Ann Keller & Daniel Kifer & Philip Leclerc & Jerome P. Reiter & Rolando A. Rodrígue, 2023. "An in-depth examination of requirements for disclosure risk assessment," Proceedings of the National Academy of Sciences, Proceedings of the National Academy of Sciences, vol. 120(43), pages 2220558120-, October.
    3. Raj Chetty & John N. Friedman, 2019. "A Practical Method to Reduce Privacy Loss When Disclosing Statistics Based on Small Samples," AEA Papers and Proceedings, American Economic Association, vol. 109, pages 414-420, May.
    4. John M. Abowd & Robert Ashmead & Ryan Cumings-Menon & Simson Garfinkel & Micah Heineck & Christine Heiss & Robert Johns & Daniel Kifer & Philip Leclerc & Ashwin Machanavajjhala & Brett Moran & William, 2022. "The 2020 Census Disclosure Avoidance System TopDown Algorithm," Papers 2204.08986, arXiv.org.
    5. Amorino, Chiara & Gloter, Arnaud & Halconruy, Hélène, 2025. "Evolving privacy: Drift parameter estimation for discretely observed i.i.d. diffusion processes under LDP," Stochastic Processes and their Applications, Elsevier, vol. 181(C).
    6. Ori Heffetz & Katrina Ligett, 2014. "Privacy and Data-Based Research," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 75-98, Spring.
    7. Toth Daniell, 2014. "Data Smearing: An Approach to Disclosure Limitation for Tabular Data," Journal of Official Statistics, Sciendo, vol. 30(4), pages 839-857, December.
    8. Soumya Mukherjee & Aratrika Mustafi & Aleksandra Slavković & Lars Vilhuber, 2023. "Assessing Utility of Differential Privacy for RCTs," Papers 2309.14581, arXiv.org.
    9. Katherine B. Coffman & Lucas C. Coffman & Keith M. Marzilli Ericson, 2017. "The Size of the LGBT Population and the Magnitude of Antigay Sentiment Are Substantially Underestimated," Management Science, INFORMS, vol. 63(10), pages 3168-3186, October.
    10. Chongliang Luo & Md. Nazmul Islam & Natalie E. Sheils & John Buresh & Jenna Reps & Martijn J. Schuemie & Patrick B. Ryan & Mackenzie Edmondson & Rui Duan & Jiayi Tong & Arielle Marks-Anglin & Jiang Bi, 2022. "DLMM as a lossless one-shot algorithm for collaborative multi-site distributed linear mixed models," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    11. Lalanne, Clément & Gadat, Sébastien, 2024. "Privately Learning Smooth Distributions on the Hypercube by Projections," TSE Working Papers 24-1505, Toulouse School of Economics (TSE).
    12. Kwak, Seung Woo & Ahn, Jeongyoun & Lee, Jaewoo & Park, Cheolwoo, 2024. "Differentially Private Goodness-of-Fit Tests for Continuous Variables," Econometrics and Statistics, Elsevier, vol. 31(C), pages 81-99.
    13. Chang, Jinyuan & Hu, Qiao & Kolaczyk, Eric D. & Yao, Qiwei & Yi, Fengting, 2024. "Edge differentially private estimation in the β-model via jittering and method of moments," LSE Research Online Documents on Economics 122099, London School of Economics and Political Science, LSE Library.
    14. Claire McKay Bowen & Fang Liu & Bingyue Su, 2021. "Differentially private data release via statistical election to partition sequentially," METRON, Springer;Sapienza Università di Roma, vol. 79(1), pages 1-31, April.
    15. Jinshuo Dong & Aaron Roth & Weijie J. Su, 2022. "Gaussian differential privacy," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(1), pages 3-37, February.
    16. Jing Lei & Anne‐Sophie Charest & Aleksandra Slavkovic & Adam Smith & Stephen Fienberg, 2018. "Differentially private model selection with penalized and constrained likelihood," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 181(3), pages 609-633, June.
    17. Ryan Cumings-Menon, 2022. "Differentially Private Estimation via Statistical Depth," Papers 2207.12602, arXiv.org.
    18. Vishesh Karwa & Pavel N. Krivitsky & Aleksandra B. Slavković, 2017. "Sharing social network data: differentially private estimation of exponential family random-graph models," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 66(3), pages 481-500, April.
    19. Bi, Xuan & Shen, Xiaotong, 2023. "Distribution-invariant differential privacy," Journal of Econometrics, Elsevier, vol. 235(2), pages 444-453.
    20. Zhao, Wenbiao & Zhu, Xuehu & Zhu, Lixing, 2025. "Minimax rates of convergence for sliced inverse regression with differential privacy," Computational Statistics & Data Analysis, Elsevier, vol. 201(C).
