
When Experimental Economics Meets Large Language Models: Evidence-based Tactics

Author

Listed:
  • Shu Wang
  • Zijun Yao
  • Shuhuai Zhang
  • Jianuo Gai
  • Tracy Xiao Liu
  • Songfa Zhong

Abstract

Advancements in large language models (LLMs) have sparked a growing interest in measuring and understanding their behavior through experimental economics. However, there is still a lack of established guidelines for designing economic experiments for LLMs. Inspired by principles from experimental economics with insights from LLM research in artificial intelligence, we outline key considerations in the experimental design and implementation stage, and perform two sets of experiments to assess the impact of these considerations on LLMs' responses. Based on our findings, we discuss seven practical tactics for conducting experiments with LLMs. Our study enhances the design, replicability, and generalizability of LLM experiments, and broadens the scope of experimental economics in the digital age.

Suggested Citation

  • Shu Wang & Zijun Yao & Shuhuai Zhang & Jianuo Gai & Tracy Xiao Liu & Songfa Zhong, 2025. "When Experimental Economics Meets Large Language Models: Evidence-based Tactics," Papers 2505.21371, arXiv.org, revised Jul 2025.
  • Handle: RePEc:arx:papers:2505.21371

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2505.21371
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Samuel Chang & Andrew Kennedy & Aaron Leonard & John A. List, 2024. "12 Best Practices for Leveraging Generative AI in Experimental Research," NBER Working Papers 33025, National Bureau of Economic Research, Inc.
    2. Syngjoo Choi & Shachar Kariv & Wieland Müller & Dan Silverman, 2014. "Who Is (More) Rational?," American Economic Review, American Economic Association, vol. 104(6), pages 1518-1550, June.
    3. Pablo Brañas-Garza & Diego Jorrat & Antonio M. Espín & Angel Sánchez, 2023. "Paid and hypothetical time preferences are the same: lab, field and online evidence," Experimental Economics, Springer;Economic Science Association, vol. 26(2), pages 412-434, April.
    4. Li, Jing & Dow, William H & Kariv, Shachar, 2017. "Social preferences of future physicians," Department of Economics, Working Paper Series qt5vw9g5tj, Department of Economics, Institute for Business and Economic Research, UC Berkeley.
    5. Thomas Dohmen & Armin Falk & David Huffman & Uwe Sunde & Jürgen Schupp & Gert G. Wagner, 2011. "Individual Risk Attitudes: Measurement, Determinants, And Behavioral Consequences," Journal of the European Economic Association, European Economic Association, vol. 9(3), pages 522-550, June.
    6. Philip Brookins & Jason DeBacker, 2024. "Playing games with GPT: What can we learn about a large language model from canonical strategic games?," Economics Bulletin, AccessEcon, vol. 44(1), pages 25-37.
    7. Davis, Douglas D. & Holt, Charles A., 1993. "Experimental economics: Methods, problems and promise," Estudios Económicos, El Colegio de México, Centro de Estudios Económicos, vol. 8(2), pages 179-212.
    8. Smith, Vernon L, 1982. "Microeconomic Systems as an Experimental Science," American Economic Review, American Economic Association, vol. 72(5), pages 923-955, December.
    9. Mingshi Chen & Tracy Xiao Liu & You Shan & Shu Wang & Songfa Zhong & Yanju Zhou, 2025. "How General Are Measures of Choice Consistency? Evidence from Experimental and Scanner Data," Papers 2505.05275, arXiv.org.
    10. Muriel Niederle, 2025. "Experiments: Why, How, and A Users Guide for Producers as well as Consumers," NBER Working Papers 33630, National Bureau of Economic Research, Inc.
    11. Afriat, Sidney N, 1972. "Efficiency Estimation of Production Function," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 13(3), pages 568-598, October.
    12. Yoram Halevy & Dotan Persitz & Lanny Zrill, 2018. "Parametric Recoverability of Preferences," Journal of Political Economy, University of Chicago Press, vol. 126(4), pages 1558-1593.
    13. Ali Goli & Amandeep Singh, 2024. "Frontiers: Can Large Language Models Capture Human Preferences?," Marketing Science, INFORMS, vol. 43(4), pages 709-722, July.
    14. Federico Echenique & Sangmok Lee & Matthew Shum, 2011. "The Money Pump as a Measure of Revealed Preference Violations," Journal of Political Economy, University of Chicago Press, vol. 119(6), pages 1201-1223.
    15. John J. Horton, 2023. "Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?," NBER Working Papers 31122, National Bureau of Economic Research, Inc.
    16. Syngjoo Choi & Raymond Fisman & Douglas Gale & Shachar Kariv, 2007. "Consistency, Heterogeneity, and Granularity of Individual Behavior under Uncertainty," Levine's Bibliography 321307000000000793, UCLA Department of Economics.
    17. John J. Horton, 2023. "Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?," Papers 2301.07543, arXiv.org.
    18. Charles R. Plott & Vernon L. Smith, 1978. "An Experimental Examination of Two Exchange Institutions," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 45(1), pages 133-153.
    19. Syngjoo Choi & Raymond Fisman & Douglas Gale & Shachar Kariv, 2007. "Consistency and Heterogeneity of Individual Behavior under Uncertainty," American Economic Review, American Economic Association, vol. 97(5), pages 1921-1938, December.
    20. Yang Chen & Samuel N. Kirshner & Anton Ovchinnikov & Meena Andiappan & Tracy Jenkin, 2025. "A Manager and an AI Walk into a Bar: Does ChatGPT Make Biased Decisions Like We Do?," Manufacturing & Service Operations Management, INFORMS, vol. 27(2), pages 354-368, March.
    21. Charles A. Holt & Susan K. Laury, 2002. "Risk Aversion and Incentive Effects," American Economic Review, American Economic Association, vol. 92(5), pages 1644-1655, December.
    22. Taylor Webb & Keith J. Holyoak & Hongjing Lu, 2023. "Emergent analogical reasoning in large language models," Nature Human Behaviour, Nature, vol. 7(9), pages 1526-1541, September.
    23. Branas-Garza, Pablo, 2007. "Promoting helping behavior with framing in dictator games," Journal of Economic Psychology, Elsevier, vol. 28(4), pages 477-486, August.
    24. Daniel Zizzo, 2010. "Experimenter demand effects in economic experiments," Experimental Economics, Springer;Economic Science Association, vol. 13(1), pages 75-98, March.
    25. Susan Laury & Melayne McInnes & J. Swarthout, 2009. "Insurance decisions for low-probability losses," Journal of Risk and Uncertainty, Springer, vol. 39(1), pages 17-44, August.
    26. Rachel Croson & Uri Gneezy, 2009. "Gender Differences in Preferences," Journal of Economic Literature, American Economic Association, vol. 47(2), pages 448-474, June.
    27. Gary Charness & Brian Jabarian & John A. List, 2025. "The next generation of experimental research with LLMs," Nature Human Behaviour, Nature, vol. 9(5), pages 833-835, May.
    28. Smith, Vernon L, 1976. "Experimental Economics: Induced Value Theory," American Economic Review, American Economic Association, vol. 66(2), pages 274-279, May.
    29. Camerer, Colin & Dreber, Anna & Forsell, Eskil & Ho, Teck-Hua & Huber, Jurgen & Johannesson, Magnus & Kirchler, Michael & Almenberg, Johan & Altmejd, Adam & Chan, Taizan & Heikensten, Emma & Holzmeist, 2016. "Evaluating replicability of laboratory experiments in Economics," MPRA Paper 75461, University Library of Munich, Germany.
    30. James Andreoni & John Miller, 2002. "Giving According to GARP: An Experimental Test of the Consistency of Preferences for Altruism," Econometrica, Econometric Society, vol. 70(2), pages 737-753, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mingshi Chen & Tracy Xiao Liu & You Shan & Shu Wang & Songfa Zhong & Yanju Zhou, 2025. "How General Are Measures of Choice Consistency? Evidence from Experimental and Scanner Data," Papers 2505.05275, arXiv.org.
    2. Müller, Daniel, 2019. "The anatomy of distributional preferences with group identity," Journal of Economic Behavior & Organization, Elsevier, vol. 166(C), pages 785-807.
    3. Michele Garagnani, 2023. "The predictive power of risk elicitation tasks," Journal of Risk and Uncertainty, Springer, vol. 67(2), pages 165-192, October.
    4. Paweł Dziewulski & Joshua Lanier & John K. -H. Quah, 2024. "Revealed preference and revealed preference cycles: a survey," Papers 2405.08459, arXiv.org.
    5. Mir Adnan Mahmood & John Rehbeck, 2022. "Correcting for Random Budgets in Revealed Preference Experiments," Games, MDPI, vol. 13(2), pages 1-14, April.
    6. Ferdinand M. Vieider & Peter Martinsson & Pham Khanh Nam & Nghi Truong, 2019. "Risk preferences and development revisited," Theory and Decision, Springer, vol. 86(1), pages 1-21, February.
    7. Dziewulski, Paweł, 2020. "Just-noticeable difference as a behavioural foundation of the critical cost-efficiency index," Journal of Economic Theory, Elsevier, vol. 188(C).
    8. Cappelen, Alexander W. & Kariv, Shachar & Sørensen, Erik Ø. & Tungodden, Bertil, 2023. "The development gap in economic rationality of future elites," Games and Economic Behavior, Elsevier, vol. 142(C), pages 866-878.
    9. Dziewulski, Paweł & Lanier, Joshua & Quah, John K.-H., 2024. "Revealed preference and revealed preference cycles: A survey," Journal of Mathematical Economics, Elsevier, vol. 113(C).
    10. Pawel Dziewulski, 2018. "Just-noticeable difference as a behavioural foundation of the critical cost-efficiency," Economics Series Working Papers 848, University of Oxford, Department of Economics.
    11. Pawel Dziewulski, 2021. "A comprehensive revealed preference approach to approximate utility maximisation," Working Paper Series 0621, Department of Economics, University of Sussex Business School.
    12. Changkuk Im & John Rehbeck, 2021. "Non-rationalizable Individuals, Stochastic Rationalizability, and Sampling," Papers 2102.03436, arXiv.org, revised Oct 2021.
    13. Ferdinand M. Vieider & Mathieu Lefebvre & Ranoua Bouchouicha & Thorsten Chmura & Rustamdjan Hakimov & Michal Krawczyk & Peter Martinsson, 2015. "Common Components Of Risk And Uncertainty Attitudes Across Contexts And Domains: Evidence From 30 Countries," Journal of the European Economic Association, European Economic Association, vol. 13(3), pages 421-452, June.
    14. Mackenzie Alston & Tatyana Deryugina & Olga Shurchkov, 2025. "Leaving Money on the Table," CESifo Working Paper Series 11788, CESifo.
    15. Thomas Demuynck & John Rehbeck, 2023. "Computing revealed preference goodness-of-fit measures with integer programming," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 76(4), pages 1175-1195, November.
    16. Uttara Balakrishnan & Johannes Haushofer & Pamela Jakiela, 2020. "How soon is now? Evidence of present bias from convex time budget experiments," Experimental Economics, Springer;Economic Science Association, vol. 23(2), pages 294-321, June.
    17. Daniel Burghart & Paul Glimcher & Stephanie Lazzaro, 2013. "An expected utility maximizer walks into a bar..," Journal of Risk and Uncertainty, Springer, vol. 46(3), pages 215-246, June.
    18. Croson, Rachel & Gächter, Simon, 2010. "The science of experimental economics," Journal of Economic Behavior & Organization, Elsevier, vol. 73(1), pages 122-131, January.
    19. E. Cettolin & P. S. Dalton & W. J. Kop & W. Zhang, 2020. "Cortisol meets GARP: the effect of stress on economic rationality," Experimental Economics, Springer;Economic Science Association, vol. 23(2), pages 554-574, June.
    20. Heufer, Jan & Hjertstrand, Per, 2019. "Homothetic preferences revealed," Journal of Economic Behavior & Organization, Elsevier, vol. 157(C), pages 602-614.
