Printed from https://ideas.repec.org/p/pri/crcwel/wp18-09-ff.html

Privacy, ethics, and data access: A case study of the Fragile Families Challenge

Author

Listed:
  • Ian Lundberg

    (Princeton University)

  • Arvind Narayanan

    (Princeton University)

  • Karen Levy

    (Cornell University)

  • Matthew Salganik

    (Princeton University)

Abstract

Stewards of social science data face a fundamental tension. On one hand, they want to make their data accessible to as many researchers as possible to facilitate new discoveries. At the same time, they want to restrict access to their data as much as possible in order to protect the people represented in the data. In this paper, we provide a case study addressing this common tension in an uncommon setting: the Fragile Families Challenge, a scientific mass collaboration designed to yield insights that could improve the lives of disadvantaged children in the United States. We describe our process of threat modeling, threat mitigation, and third-party guidance. We also describe the ethical principles that formed the basis of our process. We are open about our process and the trade-offs that we made in the hopes that others can improve on what we have done.

Suggested Citation

  • Ian Lundberg & Arvind Narayanan & Karen Levy & Matthew Salganik, 2018. "Privacy, ethics, and data access: A case study of the Fragile Families Challenge," Working Papers wp18-09-ff, Princeton University, School of Public and International Affairs, Center for Research on Child Wellbeing.
  • Handle: RePEc:pri:crcwel:wp18-09-ff

    Download full text from publisher

    File URL: https://fragilefamilies.princeton.edu/sites/fragilefamilies/files/wp18-09-ff.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Jon Kleinberg & Jens Ludwig & Sendhil Mullainathan & Ziad Obermeyer, 2015. "Prediction Policy Problems," American Economic Review, American Economic Association, vol. 105(5), pages 491-495, May.
    2. Freese, Jeremy & Peterson, David, 2017. "Replication in Social Science," SocArXiv 5bck9, Center for Open Science.
    3. Christopher Wildeman, 2009. "Parental imprisonment, the prison boom, and the concentration of childhood disadvantage," Demography, Springer;Population Association of America (PAA), vol. 46(2), pages 265-280, May.
    4. Lawrence H. Cox & Alan F. Karr & Satkartar K. Kinney, 2011. "Risk‐Utility Paradigms for Statistical Disclosure Limitation: How to Think, But Not How to Act," International Statistical Review, International Statistical Institute, vol. 79(2), pages 160-183, August.
    5. Jon Kleinberg & Himabindu Lakkaraju & Jure Leskovec & Jens Ludwig & Sendhil Mullainathan, 2018. "Human Decisions and Machine Predictions," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(1), pages 237-293.
    6. Satkartar K. Kinney & Jerome P. Reiter & Arnold P. Reznek & Javier Miranda & Ron S. Jarmin & John M. Abowd, 2011. "Towards Unrestricted Public Use Business Microdata: The Synthetic Longitudinal Business Database," International Statistical Review, International Statistical Institute, vol. 79(3), pages 362-384, December.
    7. Reichman, Nancy E. & Teitler, Julien O. & Garfinkel, Irwin & McLanahan, Sara S., 2001. "Fragile Families: sample and design," Children and Youth Services Review, Elsevier, vol. 23(4-5), pages 303-326.
    8. Sendhil Mullainathan & Jann Spiess, 2017. "Machine Learning: An Applied Econometric Approach," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 87-106, Spring.
    9. Karr, A.F. & Kohnen, C.N. & Oganian, A. & Reiter, J.P. & Sanil, A.P., 2006. "A Framework for Evaluating the Utility of Data Altered to Protect Confidentiality," The American Statistician, American Statistical Association, vol. 60, pages 224-232, August.
    10. Chris Skinner, 2012. "Statistical Disclosure Risk: Separating Potential and Harm," International Statistical Review, International Statistical Institute, vol. 80(3), pages 349-368, December.

    Citations



    Cited by:

    1. Watts, Duncan J & Beck, Emorie D & Bienenstock, Elisa Jayne & Bowers, Jake & Frank, Aaron & Grubesic, Anthony & Hofman, Jake M. & Rohrer, Julia Marie & Salganik, Matthew, 2018. "Explanation, prediction, and causality: Three sides of the same coin?," OSF Preprints u6vz5, Center for Open Science.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
    2. Delogu, Marco & Lagravinese, Raffaele & Paolini, Dimitri & Resce, Giuliano, 2024. "Predicting dropout from higher education: Evidence from Italy," Economic Modelling, Elsevier, vol. 130(C).
    3. Isil Erel & Léa H Stern & Chenhao Tan & Michael S Weisbach, 2021. "Selecting Directors Using Machine Learning," NBER Chapters, in: Big Data: Long-Term Implications for Financial Markets and Firms, pages 3226-3264, National Bureau of Economic Research, Inc.
    4. McKenzie, David & Sansone, Dario, 2017. "Man vs. Machine in Predicting Successful Entrepreneurs: Evidence from a Business Plan Competition in Nigeria," CEPR Discussion Papers 12523, C.E.P.R. Discussion Papers.
    5. Andini, Monica & Boldrini, Michela & Ciani, Emanuele & de Blasio, Guido & D'Ignazio, Alessio & Paladini, Andrea, 2022. "Machine learning in the service of policy targeting: The case of public credit guarantees," Journal of Economic Behavior & Organization, Elsevier, vol. 198(C), pages 434-475.
    6. de Blasio, Guido & D'Ignazio, Alessio & Letta, Marco, 2022. "Gotham city. Predicting ‘corrupted’ municipalities with machine learning," Technological Forecasting and Social Change, Elsevier, vol. 184(C).
    7. McKenzie, David & Sansone, Dario, 2019. "Predicting entrepreneurial success is hard: Evidence from a business plan competition in Nigeria," Journal of Development Economics, Elsevier, vol. 141(C).
    8. Vitezslav Titl & Deni Mazrekaj & Fritz Schiltz, 2024. "Identifying Politically Connected Firms: A Machine Learning Approach," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 86(1), pages 137-155, February.
    9. Alessandra Garbero & Marco Letta, 2022. "Predicting household resilience with machine learning: preliminary cross-country tests," Empirical Economics, Springer, vol. 63(4), pages 2057-2070, October.
    10. Lundberg, Ian & Brand, Jennie E. & Jeon, Nanum, 2022. "Researcher reasoning meets computational capacity: Machine learning for social science," SocArXiv s5zc8, Center for Open Science.
    11. Monica Andini & Emanuele Ciani & Guido de Blasio & Alessio D'Ignazio & Viola Salvestrini, 2017. "Targeting policy-compliers with machine learning: an application to a tax rebate programme in Italy," Temi di discussione (Economic working papers) 1158, Bank of Italy, Economic Research and International Relations Area.
    12. Jorge Mejia & Shawn Mankad & Anandasivam Gopal, 2019. "A for Effort? Using the Crowd to Identify Moral Hazard in New York City Restaurant Hygiene Inspections," Information Systems Research, INFORMS, vol. 30(4), pages 1363-1386, December.
    13. Gert Bijnens & Shyngys Karimov & Jozef Konings, 2023. "Does Automatic Wage Indexation Destroy Jobs? A Machine Learning Approach," De Economist, Springer, vol. 171(1), pages 85-117, March.
    14. Elliott Ash & Sergio Galletta & Tommaso Giommoni, 2021. "A Machine Learning Approach to Analyze and Support Anti-Corruption Policy," CESifo Working Paper Series 9015, CESifo.
    15. Andini, Monica & Ciani, Emanuele & de Blasio, Guido & D'Ignazio, Alessio & Salvestrini, Viola, 2018. "Targeting with machine learning: An application to a tax rebate program in Italy," Journal of Economic Behavior & Organization, Elsevier, vol. 156(C), pages 86-102.
    16. Guido de Blasio & Alessio D'Ignazio & Marco Letta, 2020. "Predicting Corruption Crimes with Machine Learning. A Study for the Italian Municipalities," Working Papers 16/20, Sapienza University of Rome, DISS.
    17. Allison Dwyer Emory, 2019. "Unintended Consequences: Protective State Policies and the Employment of Fathers with Criminal Records," Working Papers wp19-04-ff, Princeton University, School of Public and International Affairs, Center for Research on Child Wellbeing.
    18. Naguib, Costanza, 2019. "Estimating the Heterogeneous Impact of the Free Movement of Persons on Relative Wage Mobility," Economics Working Paper Series 1903, University of St. Gallen, School of Economics and Political Science.
    19. repec:pri:crcwel:wp12-10-ff is not listed on IDEAS
    20. Yucheng Yang & Zhong Zheng & Weinan E, 2020. "Interpretable Neural Networks for Panel Data Analysis in Economics," Papers 2010.05311, arXiv.org, revised Nov 2020.
    21. Erik Heilmann & Janosch Henze & Heike Wetzel, 2021. "Machine learning in energy forecasts with an application to high frequency electricity consumption data," MAGKS Papers on Economics 202135, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).

    More about this item

    JEL classification:

    • F13 - International Economics - - Trade - - - Trade Policy; International Trade Organizations

