Authors
- Maximilian Puelma Touzel
- Paul Cisek
- Guillaume Lajoie
Abstract
Finding the right amount of deliberation, between insufficient and excessive, is a hard decision-making problem that depends on the value we place on our time. Average-reward, putatively encoded by tonic dopamine, serves in existing reinforcement learning theory as the opportunity cost of time, including deliberation time. Importantly, this cost can itself vary with the environmental context and is not trivial to estimate. Here, we propose how the opportunity cost of deliberation can be estimated adaptively on multiple timescales to account for non-stationary contextual factors. We use it in a simple decision-making heuristic based on average-reward reinforcement learning (AR-RL) that we call Performance-Gated Deliberation (PGD). We propose PGD as a strategy used by animals wherein deliberation cost is implemented directly as urgency, a previously characterized neural signal effectively controlling the speed of the decision-making process. We show PGD outperforms AR-RL solutions in explaining behaviour and urgency of non-human primates in a context-varying random walk prediction task and is consistent with relative performance and urgency in a context-varying random dot motion task. We make readily testable predictions for both neural activity and behaviour.
Author summary
The value we place on our time impacts what we choose to do with it. Value our time too little, and we obsess over all details. Value it too much, and we rush carelessly to move on. How we value our time and how this value affects how much of it we allocate to tasks is not well-understood. The related cognitive processes are nevertheless thought to play a role in a wide range of diseases from Parkinson’s to addiction. We propose a general strategy that balances the expected value of deliberation with the time spent, where time is valued according to recent performance. We found that recorded behaviour and brain activity from a previous experiment using non-human primates could be explained by this simple decision-making strategy. We show that this strategy explains how a brain signal called ‘urgency’, which limits how long subjects deliberate, varies with context. Our work helps to integrate the neuroscience of reward representations and the brain dynamics associated with deliberation.
Suggested Citation
Maximilian Puelma Touzel & Paul Cisek & Guillaume Lajoie, 2022.
"Performance-gated deliberation: A context-adapted strategy in which urgency is opportunity cost,"
PLOS Computational Biology, Public Library of Science, vol. 18(5), pages 1-33, May.
Handle: RePEc:plo:pcbi00:1010080
DOI: 10.1371/journal.pcbi.1010080