Printed from https://ideas.repec.org/p/arx/papers/2403.12108.html

Does AI help humans make better decisions? A methodological framework for experimental evaluation

Author

Listed:
  • Eli Ben-Michael
  • D. James Greiner
  • Melody Huang
  • Kosuke Imai
  • Zhichao Jiang
  • Sooahn Shin

Abstract

The use of Artificial Intelligence (AI) based on data-driven algorithms has become ubiquitous in today's society. Yet, in many cases, and especially when stakes are high, humans still make final decisions. The critical question, therefore, is whether AI helps humans make better decisions than either a human alone or AI alone. We introduce a new methodological framework that can be used to experimentally answer this question with no additional assumptions. We measure a decision maker's ability to make correct decisions using standard classification metrics based on the baseline potential outcome. We consider a single-blinded experimental design, in which the provision of AI-generated recommendations is randomized across cases, with a human making final decisions. Under this experimental design, we show how to compare the performance of three alternative decision-making systems: human-alone, human-with-AI, and AI-alone. We apply the proposed methodology to the data from our own randomized controlled trial of a pretrial risk assessment instrument. We find that AI recommendations do not improve the classification accuracy of a judge's decision to impose cash bail. Our analysis also shows that AI-alone decisions generally perform worse than human decisions with or without AI assistance. Finally, AI recommendations tend to impose cash bail on non-white arrestees more often than necessary when compared to white arrestees.
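The three-way comparison the abstract describes can be illustrated with a toy simulation. This is a minimal sketch under strong simplifying assumptions: the data-generating process and all variable names are hypothetical, and the baseline potential outcome is treated as observed for every case, which the paper's framework does not require (the paper instead works with partially identified quantities).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data for n cases:
#   y0: binary baseline potential outcome (the "correct" label, e.g. whether
#       a negative outcome would occur absent a restrictive decision)
#   z:  randomized treatment, 1 = AI recommendation shown to the human
#   ai: the AI system's own binary recommendation
#   d:  the human's final binary decision
n = 10_000
y0 = rng.binomial(1, 0.3, n)
z = rng.binomial(1, 0.5, n)                          # single-blinded randomization
ai = (rng.random(n) < 0.2 + 0.6 * y0).astype(int)    # AI partially tracks y0
baseline = rng.binomial(1, 0.4, n)                   # judge's unassisted decision
follow = rng.random(n) < 0.7                         # judge often defers when shown AI
d = np.where((z == 1) & follow, ai, baseline)

def accuracy(decision, label):
    """Classification accuracy of decisions against the baseline outcome."""
    return float(np.mean(decision == label))

# Compare the three decision-making systems using the randomized arms.
human_alone = accuracy(d[z == 0], y0[z == 0])
human_with_ai = accuracy(d[z == 1], y0[z == 1])
ai_alone = accuracy(ai, y0)

print(f"human alone:   {human_alone:.3f}")
print(f"human with AI: {human_with_ai:.3f}")
print(f"AI alone:      {ai_alone:.3f}")
```

In the actual design, y0 is only partially observed (e.g., outcomes are not seen for detained arrestees), so the paper derives bounds on these classification metrics rather than computing them directly as above.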

Suggested Citation

  • Eli Ben-Michael & D. James Greiner & Melody Huang & Kosuke Imai & Zhichao Jiang & Sooahn Shin, 2024. "Does AI help humans make better decisions? A methodological framework for experimental evaluation," Papers 2403.12108, arXiv.org.
  • Handle: RePEc:arx:papers:2403.12108
    Download full text from publisher

    File URL: http://arxiv.org/pdf/2403.12108
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Joshua Grossman & Julian Nyarko & Sharad Goel, 2023. "Racial bias as a multi‐stage, multi‐actor problem: An analysis of pretrial detention," Journal of Empirical Legal Studies, John Wiley & Sons, vol. 20(1), pages 86-133, March.
    2. Bharti, Nitin Kumar & Roy, Sutanuka, 2023. "The early origins of judicial stringency in bail decisions: Evidence from early childhood exposure to Hindu-Muslim riots in India," Journal of Public Economics, Elsevier, vol. 221(C).
    3. Ivan A. Canay & Magne Mogstad & Jack Mountjoy, 2020. "On the Use of Outcome Tests for Detecting Bias in Decision Making," NBER Working Papers 27802, National Bureau of Economic Research, Inc.
    4. Jens Ludwig & Sendhil Mullainathan, 2021. "Fragile Algorithms and Fallible Decision-Makers: Lessons from the Justice System," Journal of Economic Perspectives, American Economic Association, vol. 35(4), pages 71-96, Fall.
    5. Isil Erel & Léa H Stern & Chenhao Tan & Michael S Weisbach, 2021. "Selecting Directors Using Machine Learning," NBER Chapters, in: Big Data: Long-Term Implications for Financial Markets and Firms, pages 3226-3264, National Bureau of Economic Research, Inc.
    6. Ginther, Donna K. & Heggeness, Misty L., 2020. "Administrative discretion in scientific funding: Evidence from a prestigious postdoctoral training program," Research Policy, Elsevier, vol. 49(4).
    7. Nicolás Grau & Damián Vergara, "undated". "A Simple Test for Prejudice in Decision Processes: The Prediction-Based Outcome Test," Working Papers wp493, University of Chile, Department of Economics.
    8. Xiaochen Hu & Xudong Zhang & Nicholas Lovrich, 2021. "Public perceptions of police behavior during traffic stops: logistic regression and machine learning approaches compared," Journal of Computational Social Science, Springer, vol. 4(1), pages 355-380, May.
    9. Shroff, Ravi & Vamvourellis, Konstantinos, 2022. "Pretrial release judgments and decision fatigue," LSE Research Online Documents on Economics 117579, London School of Economics and Political Science, LSE Library.
    10. Chugunova, Marina & Sele, Daniela, 2022. "We and It: An interdisciplinary review of the experimental evidence on how humans interact with machines," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 99(C).
    11. Stevenson, Megan T. & Doleac, Jennifer, 2019. "Algorithmic Risk Assessment in the Hands of Humans," IZA Discussion Papers 12853, Institute of Labor Economics (IZA).
    12. David Almog & Romain Gauriot & Lionel Page & Daniel Martin, 2024. "AI Oversight and Human Mistakes: Evidence from Centre Court," Papers 2401.16754, arXiv.org, revised Feb 2024.
    13. Danielle Li & Lindsey R. Raymond & Peter Bergman, 2020. "Hiring as Exploration," NBER Working Papers 27736, National Bureau of Economic Research, Inc.
    14. Richard Berk, 2019. "Accuracy and Fairness for Juvenile Justice Risk Assessments," Journal of Empirical Legal Studies, John Wiley & Sons, vol. 16(1), pages 175-194, March.
    15. Bauer, Kevin & Gill, Andrej, 2021. "Mirror, mirror on the wall: Machine predictions and self-fulfilling prophecies," SAFE Working Paper Series 313, Leibniz Institute for Financial Research SAFE.
    16. Runshan Fu & Ginger Zhe Jin & Meng Liu, 2022. "Does Human-algorithm Feedback Loop Lead to Error Propagation? Evidence from Zillow’s Zestimate," NBER Working Papers 29880, National Bureau of Economic Research, Inc.
    17. Fumagalli, Elena & Rezaei, Sarah & Salomons, Anna, 2022. "OK computer: Worker perceptions of algorithmic recruitment," Research Policy, Elsevier, vol. 51(2).
    18. Bauer, Kevin & Pfeuffer, Nicolas & Abdel-Karim, Benjamin M. & Hinz, Oliver & Kosfeld, Michael, 2020. "The terminator of social welfare? The economic consequences of algorithmic discrimination," SAFE Working Paper Series 287, Leibniz Institute for Financial Research SAFE.
    19. Brendan O'Flaherty & Rajiv Sethi & Morgan Williams, 2024. "The nature, detection, and avoidance of harmful discrimination in criminal justice," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 43(1), pages 289-320, January.
    20. Bhattacharya, D. & Shvets, J., 2022. "Inferring the Performance Diversity Trade-Off in University Admissions: Evidence from Cambridge," Cambridge Working Papers in Economics 2238, Faculty of Economics, University of Cambridge.



    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.