
Statistical Tests for Replacing Human Decision Makers with Algorithms

Authors

  • Kai Feng
  • Han Hong
  • Ke Tang
  • Jingyuan Wang

Abstract

This paper proposes a statistical framework for using artificial intelligence to improve human decision making. The performance of each human decision maker is benchmarked against that of machine predictions, and the diagnoses made by a subset of the decision makers are replaced with the recommendations of the machine learning algorithm. We apply both a heuristic frequentist approach and a Bayesian posterior loss function approach to abnormal birth detection, using a nationwide dataset of doctor diagnoses from prepregnancy checkups of reproductive-age couples together with pregnancy outcomes. On a test dataset, our algorithm achieves a higher overall true positive rate and a lower false positive rate than diagnoses made by doctors alone.
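The benchmarking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the paper's frequentist and Bayesian tests are not reproduced here, and the function names, the record layout `(doctor_id, doctor_diagnosis, machine_score, outcome)`, and the toy data are all assumptions made for illustration. The sketch flags a doctor for replacement when some threshold on the machine score gives a strictly higher true positive rate and no higher false positive rate on that doctor's own cases. It ignores sampling uncertainty and picks the threshold in-sample, both of which a real analysis would need to address.

```python
from collections import defaultdict

def rates(preds, outcomes):
    """Return (true positive rate, false positive rate) for binary predictions."""
    tp = sum(1 for p, y in zip(preds, outcomes) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, outcomes) if p == 1 and y == 0)
    pos = sum(outcomes)
    neg = len(outcomes) - pos
    tpr = tp / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    return tpr, fpr

def machine_dominates(doctor_preds, scores, outcomes):
    """True if some score threshold gives the machine a strictly higher TPR
    and no higher FPR than the doctor on the same cases (hypothetical rule)."""
    d_tpr, d_fpr = rates(doctor_preds, outcomes)
    for t in sorted(set(scores)):
        m_preds = [1 if s >= t else 0 for s in scores]
        m_tpr, m_fpr = rates(m_preds, outcomes)
        if m_tpr > d_tpr and m_fpr <= d_fpr:
            return True
    return False

def doctors_to_replace(records):
    """Group records by doctor and list the doctors the machine dominates."""
    by_doctor = defaultdict(list)
    for doc, diag, score, y in records:
        by_doctor[doc].append((diag, score, y))
    replaced = []
    for doc, rows in sorted(by_doctor.items()):
        diags, scores, ys = zip(*rows)
        if machine_dominates(diags, scores, ys):
            replaced.append(doc)
    return replaced

# Toy data (fabricated for illustration only):
# (doctor_id, doctor_diagnosis, machine_score, observed_outcome)
records = [
    ("A", 1, 0.9, 1), ("A", 0, 0.2, 0), ("A", 1, 0.8, 1), ("A", 0, 0.1, 0),
    ("B", 0, 0.9, 1), ("B", 1, 0.3, 0), ("B", 1, 0.7, 1), ("B", 0, 0.2, 0),
]
print(doctors_to_replace(records))  # prints ['B']
```

In this toy data doctor A diagnoses every case correctly, so no threshold can beat A, while doctor B is matched on false positives and beaten on true positives at a score threshold of 0.3.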

Suggested Citation

  • Kai Feng & Han Hong & Ke Tang & Jingyuan Wang, 2023. "Statistical Tests for Replacing Human Decision Makers with Algorithms," Papers 2306.11689, arXiv.org, revised Dec 2024.
  • Handle: RePEc:arx:papers:2306.11689

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2306.11689
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Kohei Kawaguchi, 2021. "When Will Workers Follow an Algorithm? A Field Experiment with a Retail Business," Management Science, INFORMS, vol. 67(3), pages 1670-1695, March.
    2. Jiaming Zeng & Berk Ustun & Cynthia Rudin, 2017. "Interpretable classification models for recidivism prediction," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 180(3), pages 689-722, June.
    3. Eric Mbakop & Max Tabord‐Meehan, 2021. "Model Selection for Treatment Choice: Penalized Welfare Maximization," Econometrica, Econometric Society, vol. 89(2), pages 825-848, March.
    4. Erin M. Johnson & M. Marit Rehavi, 2016. "Physicians Treating Physicians: Information and Incentives in Childbirth," American Economic Journal: Economic Policy, American Economic Association, vol. 8(1), pages 115-141, February.
    5. Aaron Chalfin & Oren Danieli & Andrew Hillis & Zubin Jelveh & Michael Luca & Jens Ludwig & Sendhil Mullainathan, 2016. "Productivity and Selection of Human Capital with Machine Learning," American Economic Review, American Economic Association, vol. 106(5), pages 124-127, May.
    6. Sendhil Mullainathan & Jann Spiess, 2017. "Machine Learning: An Applied Econometric Approach," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 87-106, Spring.
    7. Andre Esteva & Brett Kuprel & Roberto A. Novoa & Justin Ko & Susan M. Swetter & Helen M. Blau & Sebastian Thrun, 2017. "Correction: Corrigendum: Dermatologist-level classification of skin cancer with deep neural networks," Nature, Nature, vol. 546(7660), pages 686-686, June.
    8. Scott Mayer McKinney & Marcin Sieniek & Varun Godbole & Jonathan Godwin & Natasha Antropova & Hutan Ashrafian & Trevor Back & Mary Chesus & Greg S. Corrado & Ara Darzi & Mozziyar Etemadi & Florencia G, 2020. "International evaluation of an AI system for breast cancer screening," Nature, Nature, vol. 577(7788), pages 89-94, January.
    9. Andre Esteva & Brett Kuprel & Roberto A. Novoa & Justin Ko & Susan M. Swetter & Helen M. Blau & Sebastian Thrun, 2017. "Dermatologist-level classification of skin cancer with deep neural networks," Nature, Nature, vol. 542(7639), pages 115-118, February.
    10. Apaar Sadhwani & Kay Giesecke & Justin Sirignano, 2021. "Deep Learning for Mortgage Risk [The Subprime Virus]," Journal of Financial Econometrics, Oxford University Press, vol. 19(2), pages 313-368.
    11. Elliott, Graham & Lieli, Robert P., 2013. "Predicting binary outcomes," Journal of Econometrics, Elsevier, vol. 174(1), pages 15-26.
    12. Jonathan Gruber & Maria Owings, 1996. "Physician Financial Incentives and Cesarean Section Delivery," RAND Journal of Economics, The RAND Corporation, vol. 27(1), pages 99-123, Spring.
    13. Tobias Berg & Valentin Burg & Ana Gombović & Manju Puri, 2020. "On the Rise of FinTechs: Credit Scoring Using Digital Footprints," The Review of Financial Studies, Society for Financial Studies, vol. 33(7), pages 2845-2897.
    14. Sendhil Mullainathan & Ziad Obermeyer, 2022. "Diagnosing Physician Error: A Machine Learning Approach to Low-Value Health Care [“The Determinants of Productivity in Medical Testing: Intensity and Allocation of Care,”]," The Quarterly Journal of Economics, Oxford University Press, vol. 137(2), pages 679-727.
    15. Toru Kitagawa & Aleksey Tetenov, 2018. "Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice," Econometrica, Econometric Society, vol. 86(2), pages 591-616, March.
    16. Andreas Fuster & Matthew Plosser & Philipp Schnabl & James Vickery, 2019. "The Role of Technology in Mortgage Lending," The Review of Financial Studies, Society for Financial Studies, vol. 32(5), pages 1854-1899.
    17. Janet Currie & W. Bentley MacLeod, 2017. "Diagnosing Expertise: Human Capital, Decision Making, and Performance among Physicians," Journal of Labor Economics, University of Chicago Press, vol. 35(1), pages 1-43.
    18. Gruber, Jon & Kim, John & Mayzlin, Dina, 1999. "Physician fees and procedure intensity: the case of cesarean delivery," Journal of Health Economics, Elsevier, vol. 18(4), pages 473-490, August.
    19. Boris Vallée & Yao Zeng, 2019. "Marketplace Lending: A New Banking Paradigm?," The Review of Financial Studies, Society for Financial Studies, vol. 32(5), pages 1939-1982.
    20. Susan Athey & Stefan Wager, 2021. "Policy Learning With Observational Data," Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
    21. Charles F. Manski, 2018. "Credible ecological inference for medical decisions with personalized risk assessment," Quantitative Economics, Econometric Society, vol. 9(2), pages 541-569, July.
    22. David C Chan & Matthew Gentzkow & Chuan Yu, 2022. "Selection with Variation in Diagnostic Skill: Evidence from Radiologists [The Determinants of Productivity in Medical Testing: Intensity and Allocation of Care]," The Quarterly Journal of Economics, Oxford University Press, vol. 137(2), pages 729-783.
    23. Scott Mayer McKinney & Marcin Sieniek & Varun Godbole & Jonathan Godwin & Natasha Antropova & Hutan Ashrafian & Trevor Back & Mary Chesus & Greg S. Corrado & Ara Darzi & Mozziyar Etemadi & Florencia G, 2020. "Addendum: International evaluation of an AI system for breast cancer screening," Nature, Nature, vol. 586(7829), pages 19-19, October.
    24. Ashesh Rambachan & Jon Kleinberg & Jens Ludwig & Sendhil Mullainathan, 2020. "An Economic Perspective on Algorithmic Fairness," AEA Papers and Proceedings, American Economic Association, vol. 110, pages 91-95, May.
    25. Ziyi Wang & Lijia Wei & Lian Xue, 2024. "Overcoming Medical Overuse with AI Assistance: An Experimental Investigation," Papers 2405.10539, arXiv.org.
    26. Michelle Vaccaro & Abdullah Almaatouq & Thomas Malone, 2024. "When combinations of humans and AI are useful: A systematic review and meta-analysis," Nature Human Behaviour, Nature, vol. 8(12), pages 2293-2303, December.
    27. Jon Kleinberg & Himabindu Lakkaraju & Jure Leskovec & Jens Ludwig & Sendhil Mullainathan, 2018. "Human Decisions and Machine Predictions," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(1), pages 237-293.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marie-Pierre Dargnies & Rustamdjan Hakimov & Dorothea Kübler, 2025. "Behavioral Measures Improve AI Hiring: A Field Experiment," Rationality and Competition Discussion Paper Series 532, CRC TRR 190 Rationality and Competition.
    2. Andreas Fuster & Paul Goldsmith‐Pinkham & Tarun Ramadorai & Ansgar Walther, 2022. "Predictably Unequal? The Effects of Machine Learning on Credit Markets," Journal of Finance, American Finance Association, vol. 77(1), pages 5-47, February.
    3. Hannes Ullrich & Michael Allan Ribers, 2023. "Machine predictions and human decisions with variation in payoffs and skill: the case of antibiotic prescribing," Berlin School of Economics Discussion Papers 0027, Berlin School of Economics.
    4. Majd Oteibi & Adam Tamimi & Kaneez Abbas & Gabriel Tamimi & Danesh Khazaei & Hadi Khazaei, 2024. "Advancing Digital Health using AI and Machine Learning Solutions for Early Ultrasonic Detection of Breast Disorders in Women," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 11(11), pages 518-527, November.
    5. Barili, Emilia & Bertoli, Paola & Grembi, Veronica, 2021. "Fee equalization and appropriate health care," Economics & Human Biology, Elsevier, vol. 41(C).
    6. Manski, Charles F., 2023. "Probabilistic prediction for binary treatment choice: With focus on personalized medicine," Journal of Econometrics, Elsevier, vol. 234(2), pages 647-663.
    7. Garbero, Alessandra & Sakos, Grayson & Cerulli, Giovanni, 2023. "Towards data-driven project design: Providing optimal treatment rules for development projects," Socio-Economic Planning Sciences, Elsevier, vol. 89(C).
    8. Christian Posso & Jorge Tamayo & Arlen Guarin & Estefania Saravia, 2024. "Luck of the Draw: The Causal Effect of Physicians on Birth Outcomes," Borradores de Economia 1269, Banco de la Republica de Colombia.
    9. Nan Liu & Yanbo Liu & Yuya Sasaki & Yuanyuan Wan, 2025. "Nonparametric Uniform Inference in Binary Classification and Policy Values," Working Papers tecipa-811, University of Toronto, Department of Economics.
    10. Michael Allan Ribers & Hannes Ullrich, 2024. "Complementarities between algorithmic and human decision-making: The case of antibiotic prescribing," Quantitative Marketing and Economics (QME), Springer, vol. 22(4), pages 445-483, December.
    11. Venkat Ram Reddy Ganuthula & Krishna Kumar Balaraman, 2025. "The Paradox of Professional Input: How Expert Collaboration with AI Systems Shapes Their Future Value," Papers 2504.12654, arXiv.org.
    12. Heike Hennig‐Schmidt & Hendrik Jürges & Daniel Wiesen, 2019. "Dishonesty in health care practice: A behavioral experiment on upcoding in neonatology," Health Economics, John Wiley & Sons, Ltd., vol. 28(3), pages 319-338, March.
    13. David Card & Alessandra Fenizia & David Silver, 2023. "The Health Impacts of Hospital Delivery Practices," American Economic Journal: Economic Policy, American Economic Association, vol. 15(2), pages 42-81, May.
    14. Tobias Berg & Andreas Fuster & Manju Puri, 2022. "FinTech Lending," Annual Review of Financial Economics, Annual Reviews, vol. 14(1), pages 187-207, November.
    15. Battiston, Pietro & Gamba, Simona & Santoro, Alessandro, 2024. "Machine learning and the optimization of prediction-based policies," Technological Forecasting and Social Change, Elsevier, vol. 199(C).
    16. Maria De-Arteaga & Vincent Jeanselme & Artur Dubrawski & Alexandra Chouldechova, 2025. "Leveraging Expert Consistency to Improve Algorithmic Decision Support," Management Science, INFORMS, vol. 71(12), pages 10465-10485, December.
    17. Barili, E. & Bertoli, P. & Grembi, V., 2020. "Fees equalization and Appropriate Health Care," Health, Econometrics and Data Group (HEDG) Working Papers 20/09, HEDG, c/o Department of Economics, University of York.
    18. Yuming Jiang & Zhicheng Zhang & Wei Wang & Weicai Huang & Chuanli Chen & Sujuan Xi & M. Usman Ahmad & Yulan Ren & Shengtian Sang & Jingjing Xie & Jen-Yeu Wang & Wenjun Xiong & Tuanjie Li & Zhen Han & , 2023. "Biology-guided deep learning predicts prognosis and cancer immunotherapy response," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    19. Juan Carlos Perdomo, 2023. "The Relative Value of Prediction in Algorithmic Decision Making," Papers 2312.08511, arXiv.org, revised May 2024.
    20. Davide Viviano & Jess Rudder, 2020. "Policy design in experiments with unknown interference," Papers 2011.08174, arXiv.org, revised May 2024.
