
Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context

Author

Listed:
  • Jingru Jia
  • Zehua Yuan
  • Junhao Pan
  • Paul E. McNamara
  • Deming Chen

Abstract

When making decisions under uncertainty, individuals often deviate from rational behavior, which can be evaluated across three dimensions: risk preference, probability weighting, and loss aversion. Given the widespread use of large language models (LLMs) in decision-making processes, it is crucial to assess whether their behavior aligns with human norms and ethical expectations or exhibits potential biases. Several empirical studies have investigated the rationality and social behavior performance of LLMs, yet their internal decision-making tendencies and capabilities remain inadequately understood. This paper proposes a framework, grounded in behavioral economics, to evaluate the decision-making behaviors of LLMs. Through a multiple-choice-list experiment, we estimate the degree of risk preference, probability weighting, and loss aversion in a context-free setting for three commercial LLMs: ChatGPT-4.0-Turbo, Claude-3-Opus, and Gemini-1.0-pro. Our results reveal that LLMs generally exhibit patterns similar to humans, such as risk aversion and loss aversion, with a tendency to overweight small probabilities. However, there are significant variations in the degree to which these behaviors are expressed across different LLMs. We also explore their behavior when embedded with socio-demographic features, uncovering significant disparities. For instance, when modeled with attributes of sexual minority groups or physical disabilities, Claude-3-Opus displays increased risk aversion, leading to more conservative choices. These findings underscore the need for careful consideration of the ethical implications and potential biases in deploying LLMs in decision-making scenarios. Therefore, this study advocates for developing standards and guidelines to ensure that LLMs operate within ethical boundaries while enhancing their utility in complex decision-making environments.
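As a rough illustration of the three dimensions named in the abstract, the sketch below assumes one common prospect-theory parameterization: a power value function with a loss-aversion multiplier, and a one-parameter Prelec probability-weighting function. The abstract does not state the functional forms used, so the function names, the parameters `sigma`, `lam`, and `alpha`, and the simplified binary-lottery formula are all assumptions made for illustration, not the authors' specification.

```python
import math


def value(x: float, sigma: float, lam: float) -> float:
    """Assumed power value function.

    sigma < 1 gives diminishing sensitivity (risk aversion over gains);
    lam > 1 makes losses weigh more than equal-sized gains (loss aversion).
    """
    if x >= 0:
        return x ** sigma
    return -lam * ((-x) ** sigma)


def weight(p: float, alpha: float) -> float:
    """Assumed Prelec one-parameter weighting function.

    alpha < 1 overweights small probabilities and underweights large ones;
    alpha == 1 returns p unchanged.
    """
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


def lottery_value(x: float, p: float, y: float,
                  sigma: float, lam: float, alpha: float) -> float:
    """Simplified value of a lottery paying x with probability p, else y.

    Only the probability of x is transformed; the remainder of the
    decision weight goes to y. This is an illustrative simplification.
    """
    w = weight(p, alpha)
    return w * value(x, sigma, lam) + (1.0 - w) * value(y, sigma, lam)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the direction of each effect.
    print("weight of p=0.1 with alpha=0.7:", weight(0.1, 0.7))
    print("50/50 gamble of +10 / -10 with lam=2:",
          lottery_value(10, 0.5, -10, sigma=1.0, lam=2.0, alpha=1.0))
```

In a choice-list setting, the row at which an agent switches from one lottery to the other bounds the parameter values consistent with its choices; under the assumptions above, comparing `lottery_value` for the two options in each row would show which parameter ranges rationalize a given switching point.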

Suggested Citation

  • Jingru Jia & Zehua Yuan & Junhao Pan & Paul E. McNamara & Deming Chen, 2024. "Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context," Papers 2406.05972, arXiv.org, revised Oct 2024.
  • Handle: RePEc:arx:papers:2406.05972

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2406.05972
    File Function: Latest version
    Download Restriction: no


    Citations

    Cited by:

    1. Seung Jung Lee & Anne Lundgaard Hansen, 2025. "Financial Stability Implications of Generative AI: Taming the Animal Spirits," Finance and Economics Discussion Series 2025-090, Board of Governors of the Federal Reserve System (U.S.).
    2. Anne Lundgaard Hansen & Seung Jung Lee, 2025. "Financial Stability Implications of Generative AI: Taming the Animal Spirits," Papers 2510.01451, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Galarza, Francisco, 2009. "Choices under Risk in Rural Peru," MPRA Paper 17708, University Library of Munich, Germany.
    2. Schleich, Joachim & Gassmann, Xavier & Meissner, Thomas & Faure, Corinne, 2019. "A large-scale test of the effects of time discounting, risk aversion, loss aversion, and present bias on household adoption of energy-efficient technologies," Energy Economics, Elsevier, vol. 80(C), pages 377-393.
    3. Charness, Gary & Gneezy, Uri & Imas, Alex, 2013. "Experimental methods: Eliciting risk preferences," Journal of Economic Behavior & Organization, Elsevier, vol. 87(C), pages 43-51.
    4. Julia Ihli, Hanna & Chiputwa, Brian & Winter, Etti & Gassner, Anja, 2022. "Risk and time preferences for participating in forest landscape restoration: The case of coffee farmers in Uganda," World Development, Elsevier, vol. 150(C).
    5. Insaf Bekir & Faten Doss, 2020. "Status quo bias and attitude towards risk: An experimental investigation," Managerial and Decision Economics, John Wiley & Sons, Ltd., vol. 41(5), pages 827-838, July.
    6. Galizzi, Matteo M. & Machado, Sara R. & Miniaci, Raffaele, 2016. "Temporal stability, cross-validity, and external validity of risk preferences measures: experimental evidence from a UK representative sample," LSE Research Online Documents on Economics 67554, London School of Economics and Political Science, LSE Library.
    7. Marc Oliver Rieger & Mei Wang & Thorsten Hens, 2015. "Risk Preferences Around the World," Management Science, INFORMS, vol. 61(3), pages 637-648, March.
    8. Liu, Elaine M. & Huang, JiKun, 2013. "Risk preferences and pesticide use by cotton farmers in China," Journal of Development Economics, Elsevier, vol. 103(C), pages 202-215.
    9. Herrmann, Tabea & Hübler, Olaf & Menkhoff, Lukas & Schmidt, Ulrich, 2016. "Allais for the poor," Kiel Working Papers 2036, Kiel Institute for the World Economy.
    10. James Alm & Antoine Malézieux, 2021. "40 years of tax evasion games: a meta-analysis," Experimental Economics, Springer;Economic Science Association, vol. 24(3), pages 699-750, September.
    11. Tamás Csermely & Alexander Rabas, 2016. "How to reveal people’s preferences: Comparing time consistency and predictive power of multiple price list risk elicitation methods," Journal of Risk and Uncertainty, Springer, vol. 53(2), pages 107-136, December.
    12. Dixit, Vinayak V. & Harb, Rami C. & Martínez-Correa, Jimmy & Rutström, Elisabet E., 2015. "Measuring risk aversion to guide transportation policy: Contexts, incentives, and respondents," Transportation Research Part A: Policy and Practice, Elsevier, vol. 80(C), pages 15-34.
    13. Holzmeister, Felix & Stefan, Matthias, 2019. "The Risk Elicitation Puzzle Revisited: Across-Methods (In)consistency?," OSF Preprints pj9u2, Center for Open Science.
    14. Tabea Herrmann & Olaf Hübler & Lukas Menkhoff & Ulrich Schmidt, 2017. "Allais for the poor: Relations to ability, information processing, and risk attitudes," Journal of Risk and Uncertainty, Springer, vol. 54(2), pages 129-156, April.
    15. Gary Charness & Thomas Garcia & Theo Offerman & Marie Claire Villeval, 2020. "Do measures of risk attitude in the laboratory predict behavior under risk in and outside of the laboratory?," Journal of Risk and Uncertainty, Springer, vol. 60(2), pages 99-123, April.
    16. Jiakun Zheng & Ling Zhou, 2025. "Too risky to hedge: An experiment on narrow bracketing," Post-Print hal-05063379, HAL.
    17. Kerri Brick & Martine Visser & Justine Burns, 2012. "Risk Aversion: Experimental Evidence from South African Fishing Communities," American Journal of Agricultural Economics, Agricultural and Applied Economics Association, vol. 94(1), pages 133-152.
    18. Holden, Stein T. & Tilahun, Mesfin, 2021. "Shocks and Stability of Risk Preferences," CLTS Working Papers 5/21, Norwegian University of Life Sciences, Centre for Land Tenure Studies.
    19. Ola Andersson & Håkan J. Holm & Jean-Robert Tyran & Erik Wengström, 2020. "Robust inference in risk elicitation tasks," Journal of Risk and Uncertainty, Springer, vol. 61(3), pages 195-209, December.
    20. Jeffrey Butler & Luigi Guiso & Tullio Jappelli, 2014. "The role of intuition and reasoning in driving aversion to risk and ambiguity," Theory and Decision, Springer, vol. 77(4), pages 455-484, December.
