Printed from https://ideas.repec.org/p/nbr/nberwo/23534.html

A Framework for Sharing Confidential Research Data, Applied to Investigating Differential Pay by Race in the U.S. Government

Authors

  • Andrés F. Barrientos
  • Alexander Bolton
  • Tom Balmat
  • Jerome P. Reiter
  • John M. de Figueiredo
  • Ashwin Machanavajjhala
  • Yan Chen
  • Charles Kneifel
  • Mark DeLong

Abstract

Data stewards seeking to provide access to large-scale social science data face a difficult challenge. They have to share data in ways that protect privacy and confidentiality, are informative for many analyses and purposes, and are relatively straightforward to use by data analysts. We present a framework for addressing this challenge. The framework uses an integrated system that includes fully synthetic data intended for wide access, coupled with means for approved users to access the confidential data via secure remote access solutions, glued together by verification servers that allow users to assess the quality of their analyses with the synthetic data. We apply this framework to data on the careers of employees of the U.S. federal government, studying differentials in pay by race. The integrated system performs as intended, allowing users to explore the synthetic data for potential pay differentials and learn through verifications which findings in the synthetic data hold up in the confidential data and which do not. We find differentials across races; for example, the gap between black and white female federal employees' pay increased over the time period. We present models for generating synthetic careers and differentially private algorithms for verification of regression results.
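
To make the idea of a differentially private verification of a regression result more concrete, the following is a minimal sketch, not the paper's algorithm. It assumes one simple, generic recipe: split the confidential records into disjoint partitions, fit the regression in each, count how many partition-level slopes exceed an analyst-chosen threshold, and release that count with Laplace noise. The function names, the single-predictor regression, and the toy data are all hypothetical and chosen only for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def ols_slope(pairs):
    """Slope of a least-squares fit of y on x; None if it is undefined."""
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        return None  # x is constant in this partition
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx


def noisy_partition_count(records, threshold, num_partitions, epsilon, rng):
    """Noisy count of partitions whose fitted slope exceeds `threshold`.

    Assumption: neighbouring datasets differ by replacing one record, and
    the shuffle does not depend on the data. One record then falls in
    exactly one partition, so the count changes by at most 1 and Laplace
    noise with scale 1/epsilon gives epsilon-differential privacy.
    """
    shuffled = list(records)
    rng.shuffle(shuffled)
    # Disjoint partitions: every record appears in exactly one of them.
    partitions = [shuffled[i::num_partitions] for i in range(num_partitions)]
    slopes = [ols_slope(part) for part in partitions]
    count = sum(1 for s in slopes if s is not None and s > threshold)
    return count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Toy stand-in for confidential data: y is roughly 2 * x.
    # random.Random is fine for a demo but is not a secure noise source.
    rng = random.Random(0)
    toy = [(float(x), 2.0 * x + rng.gauss(0.0, 0.1)) for x in range(200)]
    print(noisy_partition_count(toy, threshold=1.0, num_partitions=10,
                                epsilon=1.0, rng=rng))
```

An analyst who saw a slope above the threshold in the synthetic data could read a noisy count close to the number of partitions as support for that finding in the confidential data, and a count near zero as evidence against it. Smaller values of epsilon give stronger privacy protection but a noisier answer.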

Suggested Citation

  • Andrés F. Barrientos & Alexander Bolton & Tom Balmat & Jerome P. Reiter & John M. de Figueiredo & Ashwin Machanavajjhala & Yan Chen & Charles Kneifel & Mark DeLong, 2017. "A Framework for Sharing Confidential Research Data, Applied to Investigating Differential Pay by Race in the U.S. Government," NBER Working Papers 23534, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:23534
    Note: LE LS PE TWP

    Download full text from publisher

    File URL: http://www.nber.org/papers/w23534.pdf
    Download Restriction: no


    More about this item

    JEL classification:

    • C51 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Construction and Estimation
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • J15 - Labor and Demographic Economics - - Demographic Economics - - - Economics of Minorities, Races, Indigenous Peoples, and Immigrants; Non-labor Discrimination
    • J45 - Labor and Demographic Economics - - Particular Labor Markets - - - Public Sector Labor Markets

