Printed from https://ideas.repec.org/p/arx/papers/2509.14396.html

Friend or Foe: Delegating to an AI Whose Alignment is Unknown

Authors

  • Drew Fudenberg
  • Annie Liang

Abstract

AI systems have the potential to improve decision-making, but decision makers face the risk that the AI may be misaligned with their objectives. We study this problem in the context of a treatment decision, where a designer decides which patient attributes to reveal to an AI before receiving a prediction of the patient's need for treatment. Providing the AI with more information increases the benefits of an aligned AI but also amplifies the harm from a misaligned one. We characterize how the designer should select attributes to balance these competing forces, depending on their beliefs about the AI's reliability. We show that the designer should optimally disclose attributes that identify rare segments of the population in which the need for treatment is high, and pool the remaining patients.

Suggested Citation

  • Drew Fudenberg & Annie Liang, 2025. "Friend or Foe: Delegating to an AI Whose Alignment is Unknown," Papers 2509.14396, arXiv.org.
  • Handle: RePEc:arx:papers:2509.14396

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2509.14396
    File Function: Latest version
    Download Restriction: no


