Statistical Tests for Replacing Human Decision Makers with Algorithms

Author

Listed:

Kai Feng
Han Hong
Ke Tang
Jingyuan Wang

Registered:

Ke Tang

Abstract

This paper proposes a statistical framework for using artificial intelligence to improve human decision making. The performance of each human decision maker is benchmarked against that of machine predictions, and the diagnoses made by a subset of the decision makers are replaced with the recommendations of the machine learning algorithm. We apply both a heuristic frequentist approach and a Bayesian posterior loss function approach to abnormal birth detection, using a nationwide dataset of doctor diagnoses from prepregnancy checkups of reproductive-age couples and the resulting pregnancy outcomes. On a test dataset, our algorithm achieves a higher overall true positive rate and a lower false positive rate than diagnoses made by doctors alone.
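The benchmarking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names (`rates`, `machine_tpr_at_fpr`, `flag_doctors`), the data layout, and the `margin` parameter are assumptions made for illustration, and the fixed `margin` is only a placeholder for the frequentist test or Bayesian posterior loss comparison that the paper develops. The sketch flags a decision maker when the machine, held to the same false positive rate, reaches a higher true positive rate.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    tpr: float  # true positive rate
    fpr: float  # false positive rate


def rates(decisions, outcomes):
    """Rates of binary decisions (0/1) against binary outcomes (0/1)."""
    tp = sum(1 for d, y in zip(decisions, outcomes) if d == 1 and y == 1)
    fp = sum(1 for d, y in zip(decisions, outcomes) if d == 1 and y == 0)
    pos = sum(outcomes)
    neg = len(outcomes) - pos
    return Rates(tpr=tp / pos if pos else 0.0, fpr=fp / neg if neg else 0.0)


def machine_tpr_at_fpr(scores, outcomes, max_fpr):
    """Best machine TPR over score thresholds whose FPR does not exceed max_fpr."""
    best = 0.0
    for threshold in sorted(set(scores)):
        predicted = [1 if s >= threshold else 0 for s in scores]
        r = rates(predicted, outcomes)
        if r.fpr <= max_fpr:
            best = max(best, r.tpr)
    return best


def flag_doctors(cases, margin=0.0):
    """Return ids of doctors the machine beats at their own FPR.

    cases maps a (hypothetical) doctor id to a list of
    (doctor_decision, machine_score, outcome) tuples.
    margin is an illustrative slack, NOT a statistical test.
    """
    # The machine is evaluated on the pooled cases of all doctors.
    scores = [s for rows in cases.values() for _, s, _ in rows]
    outcomes = [y for rows in cases.values() for _, _, y in rows]
    flagged = []
    for doctor, rows in cases.items():
        doc = rates([d for d, _, _ in rows], [y for _, _, y in rows])
        if machine_tpr_at_fpr(scores, outcomes, doc.fpr) > doc.tpr + margin:
            flagged.append(doctor)
    return flagged


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = {
        "A": [(1, 0.9, 1), (0, 0.2, 0), (1, 0.8, 1), (0, 0.1, 0)],
        "B": [(0, 0.7, 1), (1, 0.3, 0), (0, 0.95, 1), (0, 0.15, 0)],
    }
    print(flag_doctors(toy))
```

In practice the comparison would have to account for sampling error in each decision maker's estimated rates, which is the role the paper's statistical tests play; a fixed margin does not do that.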

Suggested Citation

Kai Feng & Han Hong & Ke Tang & Jingyuan Wang, 2023. "Statistical Tests for Replacing Human Decision Makers with Algorithms," Papers 2306.11689, arXiv.org, revised Dec 2024.

Handle: RePEc:arx:papers:2306.11689

Other versions of this item:

Kai Feng & Han Hong & Ke Tang & Jingyuan Wang, 2025. "Statistical Tests for Replacing Human Decision Makers with Algorithms," Management Science, INFORMS, vol. 71(11), pages 9145-9170, November.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Marie-Pierre Dargnies & Rustamdjan Hakimov & Dorothea Kübler, 2025. "Behavioral Measures Improve AI Hiring: A Field Experiment," Rationality and Competition Discussion Paper Series 532, CRC TRR 190 Rationality and Competition.
Andreas Fuster & Paul Goldsmith‐Pinkham & Tarun Ramadorai & Ansgar Walther, 2022. "Predictably Unequal? The Effects of Machine Learning on Credit Markets," Journal of Finance, American Finance Association, vol. 77(1), pages 5-47, February.
- Goldsmith-Pinkham, Paul & Walther, Ansgar, 2017. "Predictably Unequal? The Effects of Machine Learning on Credit Markets," CEPR Discussion Papers 12448, Centre for Economic Policy Research.
Hannes Ullrich & Michael Allan Ribers, 2023. "Machine predictions and human decisions with variation in payoffs and skill: the case of antibiotic prescribing," Berlin School of Economics Discussion Papers 0027, Berlin School of Economics.
Majd Oteibi & Adam Tamimi & Kaneez Abbas & Gabriel Tamimi & Danesh Khazaei & Hadi Khazaei, 2024. "Advancing Digital Health using AI and Machine Learning Solutions for Early Ultrasonic Detection of Breast Disorders in Women," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 11(11), pages 518-527, November.
Barili, Emilia & Bertoli, Paola & Grembi, Veronica, 2021. "Fee equalization and appropriate health care," Economics & Human Biology, Elsevier, vol. 41(C).
- Emilia Barili & Paola Bertoli & Veronica Grembi, 2020. "Fee Equalization and Appropriate Health Care," CERGE-EI Working Papers wp664, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
Manski, Charles F., 2023. "Probabilistic prediction for binary treatment choice: With focus on personalized medicine," Journal of Econometrics, Elsevier, vol. 234(2), pages 647-663.
- Charles F. Manski, 2021. "Probabilistic Prediction for Binary Treatment Choice: with Focus on Personalized Medicine," NBER Working Papers 29358, National Bureau of Economic Research, Inc.
- Charles F. Manski, 2021. "Probabilistic Prediction for Binary Treatment Choice: with focus on personalized medicine," Papers 2110.00864, arXiv.org.
Garbero, Alessandra & Sakos, Grayson & Cerulli, Giovanni, 2023. "Towards data-driven project design: Providing optimal treatment rules for development projects," Socio-Economic Planning Sciences, Elsevier, vol. 89(C).
- Garbero, Alessandra & Sakos, Grayson & Cerulli, Giovanni, 2021. "Towards Data-driven Project design: Providing Optimal Treatment Rules for Development Projects," 2021 Annual Meeting, August 1-3, Austin, Texas 314016, Agricultural and Applied Economics Association.
Christian Posso & Jorge Tamayo & Arlen Guarin & Estefania Saravia, 2024. "Luck of the Draw: The Causal Effect of Physicians on Birth Outcomes," Borradores de Economia 1269, Banco de la Republica de Colombia.
- Posso,Christian & Tamayo,Jorge & Guarin Galeano, Arlen Yahir & Saravia,Estefania, 2025. "Luck of the Draw : The Causal Effect of Physicians on Birth Outcomes," Policy Research Working Paper Series 11143, The World Bank.
Nan Liu & Yanbo Liu & Yuya Sasaki & Yuanyuan Wan, 2025. "Nonparametric Uniform Inference in Binary Classification and Policy Values," Working Papers tecipa-811, University of Toronto, Department of Economics.
- Nan Liu & Yanbo Liu & Yuya Sasaki & Yuanyuan Wan, 2025. "Nonparametric Uniform Inference in Binary Classification and Policy Values," Papers 2511.14700, arXiv.org, revised Dec 2025.
Michael Allan Ribers & Hannes Ullrich, 2024. "Complementarities between algorithmic and human decision-making: The case of antibiotic prescribing," Quantitative Marketing and Economics (QME), Springer, vol. 22(4), pages 445-483, December.
Venkat Ram Reddy Ganuthula & Krishna Kumar Balaraman, 2025. "The Paradox of Professional Input: How Expert Collaboration with AI Systems Shapes Their Future Value," Papers 2504.12654, arXiv.org.
Heike Hennig‐Schmidt & Hendrik Jürges & Daniel Wiesen, 2019. "Dishonesty in health care practice: A behavioral experiment on upcoding in neonatology," Health Economics, John Wiley & Sons, Ltd., vol. 28(3), pages 319-338, March.
- Hennig-Schmidt, Heike & Jürges, Hendrik & Wiesen, Daniel, 2018. "Dishonesty in healthcare practice: A behavioral experiment on upcoding in neonatology," HERO Online Working Paper Series 2018:3, University of Oslo, Health Economics Research Programme.
David Card & Alessandra Fenizia & David Silver, 2023. "The Health Impacts of Hospital Delivery Practices," American Economic Journal: Economic Policy, American Economic Association, vol. 15(2), pages 42-81, May.
- David Card & Alessandra Fenizia & David Silver, 2019. "The Health Impacts of Hospital Delivery Practices," NBER Working Papers 25986, National Bureau of Economic Research, Inc.
- David Card & Alessandra Fenizia & David Silver, 2020. "The Health Impacts of Hospital Delivery Practices," Working Papers 2020-73, Princeton University. Economics Department..
Tobias Berg & Andreas Fuster & Manju Puri, 2022. "FinTech Lending," Annual Review of Financial Economics, Annual Reviews, vol. 14(1), pages 187-207, November.
- Tobias Berg & Andreas Fuster & Manju Puri, 2021. "FinTech Lending," Swiss Finance Institute Research Paper Series 21-72, Swiss Finance Institute.
- Tobias Berg & Andreas Fuster & Manju Puri, 2021. "FinTech Lending," NBER Working Papers 29421, National Bureau of Economic Research, Inc.
- Berg, Tobias & Puri, Manju, 2021. "FinTech Lending," CEPR Discussion Papers 16668, Centre for Economic Policy Research.
Battiston, Pietro & Gamba, Simona & Santoro, Alessandro, 2024. "Machine learning and the optimization of prediction-based policies," Technological Forecasting and Social Change, Elsevier, vol. 199(C).
Maria De-Arteaga & Vincent Jeanselme & Artur Dubrawski & Alexandra Chouldechova, 2025. "Leveraging Expert Consistency to Improve Algorithmic Decision Support," Management Science, INFORMS, vol. 71(12), pages 10465-10485, December.
Barili, E. & Bertoli, P. & Grembi, V., 2020. "Fees equalization and Appropriate Health Care," Health, Econometrics and Data Group (HEDG) Working Papers 20/09, HEDG, c/o Department of Economics, University of York.
Yuming Jiang & Zhicheng Zhang & Wei Wang & Weicai Huang & Chuanli Chen & Sujuan Xi & M. Usman Ahmad & Yulan Ren & Shengtian Sang & Jingjing Xie & Jen-Yeu Wang & Wenjun Xiong & Tuanjie Li & Zhen Han & , 2023. "Biology-guided deep learning predicts prognosis and cancer immunotherapy response," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
Juan Carlos Perdomo, 2023. "The Relative Value of Prediction in Algorithmic Decision Making," Papers 2312.08511, arXiv.org, revised May 2024.
Davide Viviano & Jess Rudder, 2020. "Policy design in experiments with unknown interference," Papers 2011.08174, arXiv.org, revised Jun 2026.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-AIN-2023-07-24 (Artificial Intelligence)
NEP-BIG-2023-07-24 (Big Data)
NEP-CMP-2023-07-24 (Computational Economics)
