Printed from https://ideas.repec.org/a/plo/pcbi00/1013445.html

A nonlinear relationship between prediction errors and learning rates in human reinforcement-learning

Author

Listed:
  • Boluwatife Ikwunne
  • Jolie Parham
  • Erdem Pulcu

Abstract

Reinforcement-learning (RL) models have been pivotal to our understanding of how agents perform learning-based adaptations in dynamically changing environments. However, the exact nature of the relationship (e.g., linear, logarithmic, etc.) between key components of RL models such as prediction errors (PEs; the difference between the agent’s expectation and the actual outcome) and learning rates (a coefficient used by agents to update their beliefs about the environment) has not been studied in detail. Here, across (i) simulations, (ii) reanalyses of readily available datasets and (iii) a novel experiment, we demonstrate that the relationship between PEs and learning rates (i) is nonlinear over the PE/learning-rate space, and (ii) can be accounted for by an exponential-logarithmic function that can transform the magnitude of PEs instantaneously to learning rates in a novel RL model. In line with the temporal predictions of this model, we show that physiological correlates of learning rates accumulate while learners observe the outcome of their choices and update their beliefs about the environment.

Author summary

All living agents constantly learn and adapt to changes in their environments, a process normally hidden from observation and often understood through computational models. A key part of this is how we react to “prediction errors”: the difference between what we expect and what actually happens. These differences influence our “learning rate,” which is how quickly we update our beliefs about the world. Not much scientific work has been done on the exact relationship between prediction errors and learning rates. Our work demonstrates that this relationship is not always simple or linear. Instead, we suggest that it is nonlinear and depends on different types of uncertainty in the environment. Furthermore, physiological activity measured by recording pupil size during learning suggests that correlations linked to learning rates build up as we observe the outcomes of our actions and adjust our beliefs, supporting our proposed model accounting for how our brains use unexpected events to refine learning.
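
The two quantities defined in the abstract can be made concrete with a minimal sketch of a delta-rule update in which the learning rate is computed from the magnitude of the prediction error on each trial. Everything below is an assumption made for illustration: the sign convention (outcome minus expectation), the function names, the `steepness` parameter, and in particular the saturating curve `1 - exp(-k|PE|)`, which is only a placeholder nonlinear mapping and is not the exponential-logarithmic function fitted in the paper.

```python
import math


def prediction_error(outcome, expectation):
    # Assumed sign convention: positive when the outcome exceeds the expectation.
    return outcome - expectation


def illustrative_learning_rate(pe, steepness=3.0):
    # Hypothetical placeholder, NOT the paper's function: a saturating map
    # from |PE| to a learning rate in [0, 1) that is nonlinear in |PE|.
    return 1.0 - math.exp(-steepness * abs(pe))


def update(expectation, outcome, rate_fn=illustrative_learning_rate):
    # Delta rule: move the expectation toward the outcome by a fraction
    # (the learning rate) that is set by the size of the current PE.
    pe = prediction_error(outcome, expectation)
    alpha = rate_fn(pe)
    return expectation + alpha * pe, pe, alpha


def simulate(outcomes, initial=0.5):
    # Run the update over a sequence of outcomes and record each trial.
    expectation = initial
    trace = []
    for outcome in outcomes:
        expectation, pe, alpha = update(expectation, outcome)
        trace.append((expectation, pe, alpha))
    return trace


if __name__ == "__main__":
    for expectation, pe, alpha in simulate([1.0, 1.0, 0.0, 1.0]):
        print(f"expectation={expectation:.3f}  PE={pe:+.3f}  learning rate={alpha:.3f}")
```

With this placeholder curve, small prediction errors produce small updates and large ones produce near-complete updates, so the learning rate changes from trial to trial rather than being a fixed coefficient. Swapping in a different `rate_fn` is the only change needed to try another functional form.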

Suggested Citation

  • Boluwatife Ikwunne & Jolie Parham & Erdem Pulcu, 2025. "A nonlinear relationship between prediction errors and learning rates in human reinforcement-learning," PLOS Computational Biology, Public Library of Science, vol. 21(9), pages 1-21, September.
  • Handle: RePEc:plo:pcbi00:1013445
    DOI: 10.1371/journal.pcbi.1013445

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013445
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1013445&type=printable
    Download Restriction: no

    References listed on IDEAS

    1. Erev, Ido & Roth, Alvin E, 1998. "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria," American Economic Review, American Economic Association, vol. 88(4), pages 848-881, September.
    2. Peyman Khorsand & Alireza Soltani, 2017. "Optimal structure of metaplasticity for adaptive learning," PLOS Computational Biology, Public Library of Science, vol. 13(6), pages 1-22, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Philippe Jehiel & Aviman Satpathy, 2024. "Learning to be Indifferent in Complex Decisions: A Coarse Payoff-Assessment Model," Papers 2412.09321, arXiv.org, revised Dec 2024.
    2. Noah Gans & George Knox & Rachel Croson, 2007. "Simple Models of Discrete Choice and Their Performance in Bandit Experiments," Manufacturing & Service Operations Management, INFORMS, vol. 9(4), pages 383-408, December.
    3. Terry E. Daniel & Eyran J. Gisches & Amnon Rapoport, 2009. "Departure Times in Y-Shaped Traffic Networks with Multiple Bottlenecks," American Economic Review, American Economic Association, vol. 99(5), pages 2149-2176, December.
    4. Iftekhar, M. S. & Tisdell, J. G., "undated". "Learning in repeated multiple unit combinatorial auctions: An experimental study," Working Papers 267301, University of Western Australia, School of Agricultural and Resource Economics.
    5. Ianni, A., 2002. "Reinforcement learning and the power law of practice: some analytical results," Discussion Paper Series In Economics And Econometrics 203, Economics Division, School of Social Sciences, University of Southampton.
    6. Benaïm, Michel & Hofbauer, Josef & Hopkins, Ed, 2009. "Learning in games with unstable equilibria," Journal of Economic Theory, Elsevier, vol. 144(4), pages 1694-1709, July.
    7. Oechssler, Jorg & Schipper, Burkhard, 2003. "Can you guess the game you are playing?," Games and Economic Behavior, Elsevier, vol. 43(1), pages 137-152, April.
    8. Erhao Xie, 2019. "Monetary Payoff and Utility Function in Adaptive Learning Models," Staff Working Papers 19-50, Bank of Canada.
    9. B Kelsey Jack, 2009. "Auctioning Conservation Contracts in Indonesia - Participant Learning in Multiple Trial Rounds," CID Working Papers 35, Center for International Development at Harvard University.
    10. Isabelle Brocas & Juan D. Carrillo, 2022. "The development of randomization and deceptive behavior in mixed strategy games," Quantitative Economics, Econometric Society, vol. 13(2), pages 825-862, May.
    11. James Choi & David Laibson & Brigitte Madrian & Andrew Metrick, 2007. "Reinforcement Learning in Investment Behavior," Levine's Bibliography 122247000000001737, UCLA Department of Economics.
    12. Enkhtaivan, Bolortuya & Davaadorj, Zagdbazar, 2021. "Do they recall their past? CEOs’ liquidity policies across firms as they switch jobs," Journal of Behavioral and Experimental Finance, Elsevier, vol. 29(C).
    13. Anthony Ziegelmeyer & Frédéric Koessler & Kene Boun My & Laurent Denant-Boèmont, 2008. "Road Traffic Congestion and Public Information: An Experimental Investigation," Journal of Transport Economics and Policy, University of Bath, vol. 42(1), pages 43-82, January.
    14. DeJong, D.V. & Blume, A. & Neumann, G., 1998. "Learning in Sender-Receiver Games," Other publications TiSEM 4a8b4f46-f30b-4ad2-bb0c-1, Tilburg University, School of Economics and Management.
    15. Sergiu Hart & Andreu Mas-Colell, 2013. "A Simple Adaptive Procedure Leading To Correlated Equilibrium," World Scientific Book Chapters, in: Simple Adaptive Strategies From Regret-Matching to Uncoupled Dynamics, chapter 2, pages 17-46, World Scientific Publishing Co. Pte. Ltd.
    16. Marco LiCalzi & Roland Mühlenbernd, 2022. "Feature-weighted categorized play across symmetric games," Experimental Economics, Springer;Economic Science Association, vol. 25(3), pages 1052-1078, June.
    17. Ferraro Paul J & Vossler Christian A, 2010. "The Source and Significance of Confusion in Public Goods Experiments," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 10(1), pages 1-42, July.
    18. Mariano Runco, 2013. "Estimating depth of reasoning in a repeated guessing game with no feedback," Experimental Economics, Springer;Economic Science Association, vol. 16(3), pages 402-413, September.
    19. Fernando Lozano & Jaime Lozano & Mario García, 2007. "An artificial economy based on reinforcement learning and agent based modeling," Documentos de Trabajo 3907, Universidad del Rosario.
    20. Micha Heilbron & Florent Meyniel, 2019. "Confidence resets reveal hierarchical adaptive learning in humans," PLOS Computational Biology, Public Library of Science, vol. 15(4), pages 1-24, April.
