Assessing Algorithmic Fairness with Unobserved Protected Class Using Data Combination

Author

Listed:
  • Nathan Kallus

    (Cornell University, Ithaca, New York 14850)

  • Xiaojie Mao

    (Cornell University, Ithaca, New York 14850)

  • Angela Zhou

    (Cornell University, Ithaca, New York 14850)

Abstract

The increasing impact of algorithmic decisions on people’s lives compels us to scrutinize their fairness and, in particular, the disparate impacts that ostensibly color-blind algorithms can have on different groups. Examples include credit decisioning, hiring, advertising, criminal justice, personalized medicine, and targeted policy making, where in some cases legislative or regulatory frameworks for fairness exist and define specific protected classes. In this paper we study a fundamental challenge to assessing disparate impacts in practice: protected class membership is often not observed in the data. This is particularly a problem in lending and healthcare. We consider the use of an auxiliary data set, such as the U.S. census, to construct models that predict the protected class from proxy variables, such as surname and geolocation. We show that even with such data, a variety of common disparity measures are generally unidentifiable, providing a new perspective on the documented biases of popular proxy-based methods. We provide exact characterizations of the tightest possible set of all possible true disparities that are consistent with the data (and possibly additional assumptions). We further provide optimization-based algorithms for computing and visualizing these sets and statistical tools to assess sampling uncertainty. Together, these enable reliable and robust assessments of disparities—an important tool when disparity assessment can have far-reaching policy implications. We demonstrate this in two case studies with real data: mortgage lending and personalized medicine dosing.
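The non-identifiability claim in the abstract can be illustrated with a small numerical sketch. This is not the authors' algorithm; it is a minimal example under simplifying assumptions: a binary protected class A, a binary outcome Y, a discrete proxy Z, and a disparity measure defined as P(Y=1 | A=1) - P(Y=1 | A=0). The primary data set is assumed to give P(Y=1 | Z), the auxiliary data set P(A=1 | Z), and neither gives the joint distribution of A and Y. Within each proxy value, the unknown joint probability P(A=1, Y=1 | Z) is constrained only by the Fréchet-Hoeffding inequalities, so the disparity is pinned down only to an interval. The function name `disparity_bounds` and all numbers are hypothetical.

```python
def disparity_bounds(p_z, a_z, y_z):
    """Bounds on P(Y=1|A=1) - P(Y=1|A=0) when A and Y are never observed together.

    p_z[i]: P(Z = z_i), the proxy distribution (must sum to 1)
    a_z[i]: P(A = 1 | Z = z_i), from the auxiliary data (hypothetical inputs)
    y_z[i]: P(Y = 1 | Z = z_i), from the primary data (hypothetical inputs)
    """
    p_a = sum(p * a for p, a in zip(p_z, a_z))  # P(A = 1)
    p_y = sum(p * y for p, y in zip(p_z, y_z))  # P(Y = 1)
    if not 0.0 < p_a < 1.0:
        raise ValueError("both groups need positive probability")

    # Frechet-Hoeffding limits on the unknown P(A=1, Y=1 | Z=z), averaged over Z.
    w_lo = sum(p * max(0.0, a + y - 1.0) for p, a, y in zip(p_z, a_z, y_z))
    w_hi = sum(p * min(a, y) for p, a, y in zip(p_z, a_z, y_z))

    def disparity(w):
        # w = P(A=1, Y=1); the disparity is increasing in w, so the
        # extremes of w give the extremes of the disparity.
        return w / p_a - (p_y - w) / (1.0 - p_a)

    return disparity(w_lo), disparity(w_hi)


# Two proxy cells (say, two neighborhoods), equally likely.
lo, hi = disparity_bounds(p_z=[0.5, 0.5], a_z=[0.8, 0.2], y_z=[0.7, 0.4])
print(f"disparity lies in [{lo:.3f}, {hi:.3f}]")
```

With these made-up inputs the interval runs from about -0.1 to about 0.7, so the data alone cannot determine even the sign of the disparity. The interval collapses to a single point only when the proxy predicts the protected class perfectly, that is, when every P(A=1 | Z) is 0 or 1.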

Suggested Citation

  • Nathan Kallus & Xiaojie Mao & Angela Zhou, 2022. "Assessing Algorithmic Fairness with Unobserved Protected Class Using Data Combination," Management Science, INFORMS, vol. 68(3), pages 1959-1981, March.
  • Handle: RePEc:inm:ormnsc:v:68:y:2022:i:3:p:1959-1981
    DOI: 10.1287/mnsc.2020.3850

    References listed on IDEAS

    1. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    2. Federico Ciliberto & Elie Tamer, 2009. "Market Structure and Multiple Equilibria in Airline Markets," Econometrica, Econometric Society, vol. 77(6), pages 1791-1828, November.
    3. Ridder, Geert & Moffitt, Robert, 2007. "The Econometrics of Data Combination," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 75, Elsevier.
    4. Avi Goldfarb & Catherine Tucker, 2011. "Online Display Advertising: Targeting and Obtrusiveness," Marketing Science, INFORMS, vol. 30(3), pages 389-404, May-June.
    5. Anja Lambrecht & Catherine Tucker, 2019. "Algorithmic Bias? An Empirical Study of Apparent Gender-Based Discrimination in the Display of STEM Career Ads," Management Science, INFORMS, vol. 65(7), pages 2966-2981, July.
    6. Hamsa Bastani & Mohsen Bayati, 2020. "Online Decision Making with High-Dimensional Covariates," Operations Research, INFORMS, vol. 68(1), pages 276-294, January.
    7. Ganesh Iyer & David Soberman & J. Miguel Villas-Boas, 2005. "The Targeting of Advertising," Marketing Science, INFORMS, vol. 24(3), pages 461-476, May.
    8. Avi Goldfarb & Catherine Tucker, 2011. "Rejoinder--Implications of 'Online Display Advertising: Targeting and Obtrusiveness'," Marketing Science, INFORMS, vol. 30(3), pages 413-415, May-June.
    9. Keisuke Hirano & Jack R. Porter, 2012. "Impossibility Results for Nondifferentiable Functionals," Econometrica, Econometric Society, vol. 80(4), pages 1769-1790, July.
    10. Jon Wakefield, 2004. "Ecological inference for 2 × 2 tables (with discussion)," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 167(3), pages 385-445, July.
    11. Yanqin Fan & Robert Sherman & Matthew Shum, 2014. "Identifying Treatment Effects Under Data Combination," Econometrica, Econometric Society, vol. 82(2), pages 811-822, March.
    12. Arie Beresteanu & Ilya Molchanov & Francesca Molinari, 2011. "Sharp Identification Regions in Models With Convex Moment Predictions," Econometrica, Econometric Society, vol. 79(6), pages 1785-1821, November.
    13. Imai, Kosuke & Khanna, Kabir, 2016. "Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records," Political Analysis, Cambridge University Press, vol. 24(2), pages 263-272, April.
    14. A. Charnes & W. W. Cooper, 1962. "Programming with linear fractional functionals," Naval Research Logistics Quarterly, John Wiley & Sons, vol. 9(3‐4), pages 181-186, September.
    15. Jon Wakefield, 2004. "Ecological inference for 2 × 2 tables," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 167(3), pages 385-425, July.

    Citations

    Cited by:

    1. Cohle, Zachary & Ortega, Alberto, 2023. "The effect of the opioid crisis on patenting," Journal of Economic Behavior & Organization, Elsevier, vol. 214(C), pages 493-521.
    2. Benjamin Lu & Jia Wan & Derek Ouyang & Jacob Goldin & Daniel E. Ho, 2024. "Quantifying the Uncertainty of Imputed Demographic Disparity Estimates: The Dual Bootstrap," NBER Chapters, in: Race, Ethnicity, and Economic Statistics for the 21st Century, National Bureau of Economic Research, Inc.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christian Bontemps & Thierry Magnac & Eric Maurin, 2012. "Set Identified Linear Models," Econometrica, Econometric Society, vol. 80(3), pages 1129-1155, May.
    2. Alex Jiyoung Kim & Subramanian Balachander, 2023. "Coordinating traditional media advertising and online advertising in brand marketing," Production and Operations Management, Production and Operations Management Society, vol. 32(6), pages 1865-1879, June.
    3. Francesca Molinari, 2020. "Microeconometrics with Partial Identification," CeMMAP working papers CWP15/20, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Semenova, Vira, 2023. "Debiased machine learning of set-identified linear models," Journal of Econometrics, Elsevier, vol. 235(2), pages 1725-1746.
    5. Nathan M. Fong, 2017. "How Targeting Affects Customer Search: A Field Experiment," Management Science, INFORMS, vol. 63(7), pages 2353-2364, July.
    6. Hiroaki Kaido & Yi Zhang, 2019. "Robust Likelihood Ratio Tests for Incomplete Economic Models," Papers 1910.04610, arXiv.org, revised Dec 2019.
    7. Guitart, Ivan A. & Hervet, Guillaume, 2017. "The impact of contextual television ads on online conversions: An application in the insurance industry," International Journal of Research in Marketing, Elsevier, vol. 34(2), pages 480-498.
    8. Magnac, Thierry, 2013. "Identification partielle : méthodes et conséquences pour les applications empiriques," L'Actualité Economique, Société Canadienne de Science Economique, vol. 89(4), pages 233-258, Décembre.
    9. Henk Kox & Bas Straathof & Gijsbert Zwart, 2017. "Targeted advertising, platform competition, and privacy," Journal of Economics & Management Strategy, Wiley Blackwell, vol. 26(3), pages 557-570, September.
    10. Jan Krämer & Daniel Schnurr & Michael Wohlfarth, 2019. "Winners, Losers, and Facebook: The Role of Social Logins in the Online Advertising Ecosystem," Management Science, INFORMS, vol. 65(4), pages 1678-1699, April.
    11. Byun, Kyung-Ah & Hong, JungHwa & William James, Kevin, 2023. "When does a goal-appeal match affect consumer satisfaction? Examining the work and play context," Journal of Business Research, Elsevier, vol. 158(C).
    12. Amin Sayedi, 2018. "Real-Time Bidding in Online Display Advertising," Marketing Science, INFORMS, vol. 37(4), pages 553-568, August.
    13. Zhang, Jianqiang & He, Xiuli, 2019. "Targeted advertising by asymmetric firms," Omega, Elsevier, vol. 89(C), pages 136-150.
    14. Peitz, Martin & Reisinger, Markus, 2014. "The Economics of Internet Media," Working Papers 14-23, University of Mannheim, Department of Economics.
    15. Karle, Heiko & Peitz, Martin, 2017. "De-targeting: Advertising an assortment of products to loss-averse consumers," European Economic Review, Elsevier, vol. 95(C), pages 103-124.
    16. Khim-Yong Goh & Kai-Lung Hui & Ivan P. L. Png, 2015. "Privacy and Marketing Externalities: Evidence from Do Not Call," Management Science, INFORMS, vol. 61(12), pages 2982-3000, December.
    17. Francesca Molinari, 2019. "Econometrics with Partial Identification," CeMMAP working papers CWP25/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    18. Kaido, Hiroaki, 2017. "Asymptotically Efficient Estimation Of Weighted Average Derivatives With An Interval Censored Variable," Econometric Theory, Cambridge University Press, vol. 33(5), pages 1218-1241, October.
    19. Victor Chernozhukov & Denis Chetverikov & Kengo Kato, 2013. "Testing Many Moment Inequalities," CeMMAP working papers 65/13, Institute for Fiscal Studies.
    20. Shun-Yang Lee & Julian Runge & Daniel Yoo & Yakov Bart & Anett Gyurak & J. W. Schneider, 2023. "COVID-19 Demand Shocks Revisited: Did Advertising Technology Help Mitigate Adverse Consequences for Small and Midsize Businesses?," Papers 2307.09035, arXiv.org, revised Jan 2024.
