When Does Model-Based Control Pay Off?

Author

Listed:

Wouter Kool
Fiery A Cushman
Samuel J Gershman

Abstract

Many accounts of decision making and reinforcement learning posit the existence of two distinct systems that control choice: a fast, automatic system and a slow, deliberative system. Recent research formalizes this distinction by mapping these systems to “model-free” and “model-based” strategies in reinforcement learning. Model-free strategies are computationally cheap, but sometimes inaccurate, because action values can be accessed by inspecting a look-up table constructed through trial-and-error. In contrast, model-based strategies compute action values through planning in a causal model of the environment, which is more accurate but also more cognitively demanding. It is assumed that this trade-off between accuracy and computational demand plays an important role in the arbitration between the two strategies, but we show that the hallmark task for dissociating model-free and model-based strategies, as well as several related variants, do not embody such a trade-off. We describe five factors that reduce the effectiveness of the model-based strategy on these tasks by reducing its accuracy in estimating reward outcomes and decreasing the importance of its choices. Based on these observations, we describe a version of the task that formally and empirically obtains an accuracy-demand trade-off between model-free and model-based strategies. Moreover, we show that human participants spontaneously increase their reliance on model-based control on this task, compared to the original paradigm. Our novel task and our computational analyses may prove important in subsequent empirical investigations of how humans balance accuracy and demand.

Author Summary: When you make a choice about what groceries to get for dinner, you can rely on two different strategies. You can make your choice by relying on habit, simply buying the items you need to make a meal that is second nature to you. However, you can also plan your actions in a more deliberative way, realizing that the friend who will join you is a vegetarian, and therefore you should not make the burgers that have become a staple in your cooking. These two strategies differ in how computationally demanding and accurate they are. While the habitual strategy is less computationally demanding (costs less effort and time), the deliberative strategy is more accurate. Scientists have been able to study the distinction between these strategies using a task that allows them to measure how much people rely on habit and planning strategies. Interestingly, we have discovered that in this task, the deliberative strategy does not increase performance accuracy, and hence does not induce a trade-off between accuracy and demand. We describe why this happens, and improve the task so that it embodies an accuracy-demand trade-off, providing evidence for theories of cost-based arbitration between cognitive strategies.
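
The contrast the abstract draws between a model-free look-up table and model-based planning can be illustrated with a minimal sketch. This is not the authors' computational model. The two-action, two-state layout, the transition probabilities, the learning rate `alpha`, and the function names are all assumptions chosen only to make the distinction concrete.

```python
# Hypothetical two-stage layout: 2 first-stage actions, 2 second-stage states.
# TRANSITIONS[a][s] = probability that first-stage action a leads to state s.
# These numbers are illustrative assumptions, not values taken from the paper.
TRANSITIONS = [[0.7, 0.3], [0.3, 0.7]]


def model_free_update(q_table, action, reward, alpha=0.5):
    """Cheap strategy: nudge the cached value of the chosen action
    toward the reward that was just observed (trial-and-error learning)."""
    q_table[action] += alpha * (reward - q_table[action])
    return q_table


def model_based_values(state_values, transitions=TRANSITIONS):
    """Costlier strategy: plan through the transition model by computing
    the expected second-stage value of each first-stage action."""
    return [
        sum(p * v for p, v in zip(row, state_values))
        for row in transitions
    ]


if __name__ == "__main__":
    q = [0.0, 0.0]
    print(model_free_update(q, action=0, reward=1.0))
    print(model_based_values([1.0, 0.0]))
```

In this sketch the model-free learner only touches the entry for the action it took, while the model-based planner re-evaluates every action from the current state values, which is where the extra computation comes from.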

Suggested Citation

Wouter Kool & Fiery A Cushman & Samuel J Gershman, 2016. "When Does Model-Based Control Pay Off?," PLOS Computational Biology, Public Library of Science, vol. 12(8), pages 1-34, August.

Handle: RePEc:plo:pcbi00:1005090
DOI: 10.1371/journal.pcbi.1005090

References listed on IDEAS

Amir Dezfouli & Bernard W Balleine, 2013. "Actions, Action Sequences and Habits: Evidence That Goal-Directed and Habitual Action Control Are Hierarchically Organized," PLOS Computational Biology, Public Library of Science, vol. 9(12), pages 1-14, December.
David K. Levine & Drew Fudenberg, 2006. "A Dual-Self Model of Impulse Control," American Economic Review, American Economic Association, vol. 96(5), pages 1449-1476, December.
- Drew Fudenberg & David K. Levine, 2004. "A Dual Self Model of Impulse Control," Harvard Institute of Economic Research Working Papers 2049, Harvard - Institute of Economic Research.
- Fudenberg, Drew & Levine, David, 2006. "A Dual-Self Model of Impulse Control," Scholarly Articles 3196335, Harvard University Department of Economics.
- Drew Fudenberg & David K Levine, 2005. "A Dual Self Model of Impulse Control," Levine's Working Paper Archive 618897000000000876, David K. Levine.
- Drew Fudenberg & David K. Levine, 2006. "A Dual Self Model of Impulse Control," Harvard Institute of Economic Research Working Papers 2112, Harvard - Institute of Economic Research.
Peter Smittenaar & George Prichard & Thomas H B FitzGerald & Joern Diedrichsen & Raymond J Dolan, 2014. "Transcranial Direct Current Stimulation of Right Dorsolateral Prefrontal Cortex Does Not Affect Model-Based or Model-Free Reinforcement Learning in Humans," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-8, January.
Andrew Westbrook & Daria Kester & Todd S Braver, 2013. "What Is the Subjective Cost of Cognitive Effort? Load, Trait, and Aging Effects Revealed by Economic Preference," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-8, July.
Thomas Akam & Rui Costa & Peter Dayan, 2015. "Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task," PLOS Computational Biology, Public Library of Science, vol. 11(12), pages 1-25, December.

Citations

Cited by:

Bruno Miranda & W M Nishantha Malalasekera & Timothy E Behrens & Peter Dayan & Steven W Kennerley, 2020. "Combined model-free and model-sensitive reinforcement learning in non-human primates," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-25, June.
Carolina Feher da Silva & Todd A Hare, 2018. "A note on the analysis of two-stage task results: How changes in task structure affect what model-free and model-based strategies predict about the effects of reward and transition on the stay probabi," PLOS ONE, Public Library of Science, vol. 13(4), pages 1-13, April.
Mikhail S. Spektor & Hannah Seidler, 2022. "Violations of economic rationality due to irrelevant information during learning in decision from experience," Judgment and Decision Making, Society for Judgment and Decision Making, vol. 17(2), pages 425-448, March.
Nitzan Shahar & Tobias U Hauser & Michael Moutoussis & Rani Moran & Mehdi Keramati & NSPN consortium & Raymond J Dolan, 2019. "Improving the reliability of model-based decision-making estimates in the two-stage decision task with reaction-times and drift-diffusion modeling," PLOS Computational Biology, Public Library of Science, vol. 15(2), pages 1-25, February.
He A Xu & Alireza Modirshanechi & Marco P Lehmann & Wulfram Gerstner & Michael H Herzog, 2021. "Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making," PLOS Computational Biology, Public Library of Science, vol. 17(6), pages 1-32, June.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Bruno Miranda & W M Nishantha Malalasekera & Timothy E Behrens & Peter Dayan & Steven W Kennerley, 2020. "Combined model-free and model-sensitive reinforcement learning in non-human primates," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-25, June.
Amir Dezfouli & Bernard W Balleine, 2019. "Learning the structure of the world: The adaptive nature of state-space and action representations in multi-stage decision-making," PLOS Computational Biology, Public Library of Science, vol. 15(9), pages 1-22, September.
Thomas Akam & Rui Costa & Peter Dayan, 2015. "Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task," PLOS Computational Biology, Public Library of Science, vol. 11(12), pages 1-25, December.
Leonhard K. Lades & Liam Delaney, 2024. "Self-control failures, as judged by themselves," Palgrave Communications, Palgrave Macmillan, vol. 11(1), pages 1-14, December.
Luigi Guiso, 2015. "A Test of Narrow Framing and its Origin," Italian Economic Journal: A Continuation of Rivista Italiana degli Economisti and Giornale degli Economisti, Springer;Società Italiana degli Economisti (Italian Economic Association), vol. 1(1), pages 61-100, March.
- Luigi Guiso, 2008. "A Test of Narrow Framing and its Origin," EIEF Working Papers Series 0818, Einaudi Institute for Economics and Finance (EIEF), revised Dec 2008.
- Guiso, Luigi, 2009. "A test of narrow framing and its origin," CEPR Discussion Papers 7112, C.E.P.R. Discussion Papers.
- Luigi Guiso, 2009. "A Test of Narrow Framing and its Origin," Economics Working Papers ECO2009/02, European University Institute.
Lohse, Johannes & Goeschl, Timo & Diederich , Johannes, 2014. "Giving is a question of time: Response times and contributions to a real world public good," Working Papers 0566, University of Heidelberg, Department of Economics.
André Lapied & Thomas Rongiconi, 2013. "Ambiguity as a Source of Temptation: Modeling Unstable Beliefs," Working Papers halshs-00797631, HAL.
- André Lapied & Thomas Rongiconi, 2013. "Ambiguity as a Source of Temptation: Modeling Unstable Beliefs," AMSE Working Papers 1316, Aix-Marseille School of Economics, France.
Eddie Dekel & Barton L. Lipman & Aldo Rustichini, 2009. "Temptation-Driven Preferences," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 76(3), pages 937-971.
- Eddie Dekel & Barton L. Lipman & Aldo Rustichini, 2005. "Temptation–Driven Preferences," Boston University - Department of Economics - Working Papers Series WP2005-005, Boston University - Department of Economics.
- Dekel, Eddie & Lipman, Barton L. & Rustichini, Aldo, 2006. "Temptation-Driven Preferences," Foerder Institute for Economic Research Working Papers 275695, Tel-Aviv University > Foerder Institute for Economic Research.
- Eddie Dekel & Barton L. Lipman & Aldo Rustichini, 2006. "Temptation–Driven Preferences," Boston University - Department of Economics - Working Papers Series WP2006-024, Boston University - Department of Economics.
- Eddie Dekel & Barton Lipman & Aldo Rustichini, 2006. "Temptation–Driven Preferences," Discussion Papers 1423, Northwestern University, Center for Mathematical Studies in Economics and Management Science.
Bryan, Gharad & Karlan, Dean & Nelson, Scott, 2009. "Commitment Contracts," Working Papers 73, Yale University, Department of Economics.
- Gharad Bryan & Dean Karlan & Scott Nelson, 2009. "Commitment Contracts," Working Papers 980, Economic Growth Center, Yale University.
- Bryan, Gharad & Karlan, Dean S. & Nelson, Scott, 2009. "Commitment Contracts," Center Discussion Papers 54536, Yale University, Economic Growth Center.
Laureti, Carolina & Szafarz, Ariane, 2023. "Banking regulation and costless commitment contracts for time-inconsistent agents," Economic Modelling, Elsevier, vol. 129(C).
- Carolina Laureti & Ariane Szafarz, 2023. "Banking Regulation and Costless Commitment Contracts for Time-Inconsistent Agents," Working Papers CEB 23-010, ULB -- Universite Libre de Bruxelles.
Blair Cleave & Nikos Nikiforakis & Robert Slonim, 2013. "Is there selection bias in laboratory experiments? The case of social and risk preferences," Experimental Economics, Springer;Economic Science Association, vol. 16(3), pages 372-382, September.
- Cleave, Blair L. & Nikiforakis, Nikos & Slonim, Robert, 2011. "Is There Selection Bias in Laboratory Experiments? The Case of Social and Risk Preferences," IZA Discussion Papers 5488, Institute of Labor Economics (IZA).
- Nikos Nikiforakis & Blair L. Cleave & Robert Slonim, 2013. "Is there selection bias in laboratory experiments? The case of social and risk preferences," Post-Print halshs-00943212, HAL.
Ernesto Dal Bó & Marko Terviö, 2013. "Self-Esteem, Moral Capital, And Wrongdoing," Journal of the European Economic Association, European Economic Association, vol. 11(3), pages 599-663, June.
- Marko Tervio & Ernesto Dal Bo, 2008. "Self-esteem, Moral Capital, and Wrongdoing," 2008 Meeting Papers 245, Society for Economic Dynamics.
- Ernesto Dal Bó & Marko Terviö, 2008. "Self-Esteem, Moral Capital, and Wrongdoing," NBER Working Papers 14508, National Bureau of Economic Research, Inc.
Strulik, Holger, 2023. "Hooked on weight control: An economic theory of anorexia nervosa and its impact on health and longevity," Journal of Health Economics, Elsevier, vol. 88(C).
- Strulik, Holger, 2021. "Hooked on weight control: An economic theory of anorexia nervosa, and its impact on health and longevity," University of Göttingen Working Papers in Economics 429, University of Goettingen, Department of Economics.
Stefano DellaVigna, 2009. "Psychology and Economics: Evidence from the Field," Journal of Economic Literature, American Economic Association, vol. 47(2), pages 315-372, June.
- Stefano DellaVigna, 2007. "Psychology and Economics: Evidence from the Field," NBER Working Papers 13420, National Bureau of Economic Research, Inc.
Brocas, Isabelle & Carrillo, Juan D., 2021. "Value computation and modulation: A neuroeconomic theory of self-control as constrained optimization," Journal of Economic Theory, Elsevier, vol. 198(C).
Hicken, Allen & Leider, Stephen & Ravanilla, Nico & Yang, Dean, 2018. "Temptation in vote-selling: Evidence from a field experiment in the Philippines," Journal of Development Economics, Elsevier, vol. 131(C), pages 1-14.
- Allen Hicken & Stephen G. Leider & Nico Ravanilla & Dean Yang, 2014. "Temptation in Vote-Selling: Evidence from a Field Experiment in the Philippines," CESifo Working Paper Series 4828, CESifo.
Khwaja, Ahmed & Silverman, Dan & Sloan, Frank, 2007. "Time preference, time discounting, and smoking decisions," Journal of Health Economics, Elsevier, vol. 26(5), pages 927-949, September.
- Ahmed Khwaja & Dan Silverman & Frank Sloan, 2006. "Time Preference, Time Discounting, and Smoking Decisions," NBER Working Papers 12615, National Bureau of Economic Research, Inc.
Andersen, Steffen & Harrison, Glenn W. & Lau, Morten Igel & Rutström, Elisabet E., 2014. "Dual criteria decisions," Journal of Economic Psychology, Elsevier, vol. 41(C), pages 101-113.
- Andersen, Steffen & Harrison, Glenn W. & Lau, Morten Igel & Rutström, Elisabet, 2009. "Dual Criteria Decisions," Working Papers 02-2009, Copenhagen Business School, Department of Economics.
Steffen Andersen & Cristian Badarinza & Lu Liu & Julie Marx & Tarun Ramadorai, 2022. "Reference Dependence in the Housing Market," American Economic Review, American Economic Association, vol. 112(10), pages 3398-3440, October.
- Andersen, Steffen & Badarinza, Cristian & Liu, Lu & Marx, Julie & Ramadorai, Tarun, 2022. "Reference Dependence in the Housing Market," CEPR Discussion Papers 14147, C.E.P.R. Discussion Papers.
Peysakhovich, Alexander, 2014. "How to commit (if you must): Commitment contracts and the dual-self model," Journal of Economic Behavior & Organization, Elsevier, vol. 101(C), pages 100-112.
