A new method for protecting interrelated time series with Bayesian prior distributions and synthetic data

Author

Listed:
  • Matthew J. Schneider
  • John M. Abowd

Abstract

Organizations disseminate statistical summaries of administrative data via the Web for unrestricted public use. They balance the trade-off between protection of confidentiality and quality of inference. Recent developments in disclosure avoidance techniques include the incorporation of synthetic data, which capture the essential features of underlying data by releasing altered data generated from a posterior predictive distribution. The US Census Bureau collects millions of interrelated time series microdata that are hierarchical and contain many 0s and suppressions. Rule-based disclosure avoidance techniques often require the suppression of count data for small magnitudes and the modification of data based on a small number of entities. Motivated by this problem, we use zero-inflated extensions of Bayesian generalized linear mixed models with privacy-preserving prior distributions to develop methods for protecting and releasing synthetic data from time series about thousands of small groups of entities without suppression based on the magnitudes or number of entities. We find that, as the prior distributions of the variance components in the Bayesian generalized linear mixed model become more precise towards zero, protection of confidentiality increases and the quality of inference deteriorates. We evaluate our methodology by using a strict privacy measure, empirical differential privacy and a newly defined risk measure, the probability of range identification, which directly measures attribute disclosure risk. We illustrate our results with the US Census Bureau's quarterly workforce indicators.
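The trade-off described in the abstract can be illustrated with a much simpler stand-in model. The sketch below is not the authors' zero-inflated generalized linear mixed model. It assumes a conjugate normal hierarchy in which a fixed group-effect variance `tau2` plays the role of the variance component, and every name in it (`posterior_predictive_draw`, `mean_synthetic_value`, `grand_mean`, `sigma2`) is hypothetical. Under those assumptions, driving `tau2` towards zero pulls the synthetic values towards the overall mean, so they reveal less about the confidential value and also describe it less well.

```python
import random
import statistics


def posterior_predictive_draw(y, grand_mean, tau2, sigma2, rng):
    """Draw one synthetic value for a group whose confidential value is y.

    Illustrative model (an assumption, not the paper's specification):
        theta     ~ Normal(grand_mean, tau2)   # group effect
        y | theta ~ Normal(theta, sigma2)      # confidential observation
    """
    # Weight placed on the confidential value; it goes to 0 as tau2 goes to 0.
    weight = tau2 / (tau2 + sigma2)
    post_mean = grand_mean + weight * (y - grand_mean)
    post_var = weight * sigma2  # equals tau2 * sigma2 / (tau2 + sigma2)
    # Draw the group effect from its posterior, then a new observation.
    theta = rng.gauss(post_mean, post_var ** 0.5)
    return rng.gauss(theta, sigma2 ** 0.5)


def mean_synthetic_value(y, grand_mean, tau2, sigma2, n_draws=5000, seed=0):
    """Average of many synthetic draws for one confidential value."""
    rng = random.Random(seed)
    draws = [
        posterior_predictive_draw(y, grand_mean, tau2, sigma2, rng)
        for _ in range(n_draws)
    ]
    return statistics.fmean(draws)


if __name__ == "__main__":
    # A small group whose confidential value (50) is far from the overall mean (10).
    for tau2 in (0.01, 1.0, 100.0):
        avg = mean_synthetic_value(50.0, 10.0, tau2=tau2, sigma2=4.0)
        print(f"tau2={tau2:>6}: mean synthetic value = {avg:.2f}")
```

With a small `tau2` the synthetic values cluster near the overall mean regardless of the confidential value, while a large `tau2` lets them track it closely. This mirrors, in a toy setting only, the direction of the trade-off the abstract reports.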

Suggested Citation

  • Matthew J. Schneider & John M. Abowd, 2015. "A new method for protecting interrelated time series with Bayesian prior distributions and synthetic data," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 178(4), pages 963-975, October.
  • Handle: RePEc:bla:jorssa:v:178:y:2015:i:4:p:963-975

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1111/rssa.2015.178.issue-4
    Download Restriction: Access to full text is restricted to subscribers.

    Citations


    Cited by:

    1. Carbajal De Nova, Carolina, 2017. "Synthetic data. A proposed method for applied risk management," MPRA Paper 77978, University Library of Munich, Germany, revised 28 Mar 2017.
    2. De Nova, Carolina Carbajal, 2021. "Synthetic data. A novel proposed method for applied risk management," 95th Annual Conference, March 29-30, 2021, Warwick, UK (Hybrid) 311085, Agricultural Economics Society - AES.
    3. Piyush Anand & Clarence Lee, 2023. "Using Deep Learning to Overcome Privacy and Scalability Issues in Customer Data Transfer," Marketing Science, INFORMS, vol. 42(1), pages 189-207, January.
    4. Matthew J. Schneider & Sharan Jagpal & Sachin Gupta & Shaobo Li & Yan Yu, 2018. "A Flexible Method for Protecting Marketing Data: An Application to Point-of-Sale Data," Marketing Science, INFORMS, vol. 37(1), pages 153-171, January.
