
Long-run choice anomalies in reinforcement learning with bounded memory

Author

Listed:
  • Giffin, Erin
  • Lillethun, Erik

Abstract

Violations of expected utility (EU) maximization have been demonstrated in many settings; however, anomalies are often reduced after repeated choices. We examine whether sufficient experiential learning allows convergence to EU maximization. In the model, a decision maker with long but finite memory repeatedly makes choices in the same decision problem under uncertainty. We focus on the existence and severity of one unambiguous type of long-run choice anomaly: ranking reversals (a non-EU-maximizing action being the most frequently chosen in the long run). We show that reversals exist for almost all preferences, even in realistic examples. Reversals tend to happen when payoff differences are heavily skewed. Longer memory does not eliminate the possibility of ranking reversals, but it does make reversals less severe. Our key takeaway is that finite memory can produce major violations of the expected utility ranking even in a model where both memory and the decision-making process are unbiased.
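
The ranking-reversal idea in the abstract can be illustrated with a small numerical sketch. This is not the authors' model, which the abstract does not specify in detail. It assumes a deliberately simplified, hypothetical rule: each period the decision maker observes the payoffs of both actions, remembers only the last m periods, and picks the action with the higher remembered average (ties split evenly). The payoff numbers, the function name `freq_risky_chosen`, and all parameters are assumptions made for illustration. Because the remembered draws are independent, the long-run share of periods in which the risky action is chosen equals a binomial probability, so no simulation is needed.

```python
from math import comb


def freq_risky_chosen(m, p_win=0.1, win=10.0, loss=-1.0, safe=0.0):
    """Long-run share of periods in which the risky action is chosen.

    Hypothetical rule (an assumption, not the paper's model): the agent
    compares the average of the last m risky payoffs with the constant
    safe payoff and picks the higher one, splitting ties evenly.
    """
    share = 0.0
    for k in range(m + 1):  # k = number of high payoffs held in memory
        prob_k = comb(m, k) * p_win**k * (1 - p_win) ** (m - k)
        mean_risky = (k * win + (m - k) * loss) / m
        if mean_risky > safe:
            share += prob_k
        elif mean_risky == safe:
            share += 0.5 * prob_k
    return share


# The risky action has the higher expected payoff (about 0.1 versus 0),
# but its payoff difference from the safe action is heavily skewed.
expected_risky = 0.1 * 10.0 + 0.9 * (-1.0)

for m in (5, 10, 12, 20):
    print(m, round(freq_risky_chosen(m), 3))
```

With these assumed numbers and m = 5, the risky action is chosen only when at least one high payoff is in memory, which happens with probability 1 - 0.9^5, about 0.41. The safe action is therefore chosen most often even though the risky action has the higher expected payoff, which is a ranking reversal in the abstract's sense for a risk-neutral decision maker.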

Suggested Citation

  • Giffin, Erin & Lillethun, Erik, 2025. "Long-run choice anomalies in reinforcement learning with bounded memory," Journal of Economic Behavior & Organization, Elsevier, vol. 231(C).
  • Handle: RePEc:eee:jeborg:v:231:y:2025:i:c:s0167268125000216
    DOI: 10.1016/j.jebo.2025.106901

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167268125000216
    Download Restriction: Full text for ScienceDirect subscribers only




    More about this item

    Keywords

    Expected utility; Choice anomalies; Reinforcement learning; Bounded memory; Markov chains;

    JEL classification:

    • D81 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Criteria for Decision-Making under Risk and Uncertainty
    • D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness
    • D91 - Microeconomics - - Micro-Based Behavioral Economics - - - Role and Effects of Psychological, Emotional, Social, and Cognitive Factors on Decision Making
