Human decision making balances reward maximization and policy compression

Author

Listed:

Lucy Lai
Samuel J Gershman

Abstract

Policy compression is a computational framework that describes how capacity-limited agents trade reward for simpler action policies to reduce cognitive cost. In this study, we present behavioral evidence that humans prefer simpler policies, as predicted by a capacity-limited reinforcement learning model. Across a set of tasks, we find that people exploit structure in the relationships between states, actions, and rewards to “compress” their policies. In particular, compressed policies are systematically biased towards actions with high marginal probability, thereby discarding some state information. This bias is greater when there is redundancy in the reward-maximizing action policy across states, and increases with memory load. These results could not be explained qualitatively or quantitatively by models that did not make use of policy compression under a capacity limit. We also confirmed the prediction that time pressure should further reduce policy complexity and increase action bias, based on the hypothesis that actions are selected via time-dependent decoding of a compressed code. These findings contribute to a deeper understanding of how humans adapt their decision-making strategies under cognitive resource constraints.

Author summary: Decision making taxes cognitive resources. For example, when shopping for groceries on a budget, we must evaluate which brand offers the best value for the price. But time constraints or mental fatigue can often steer us towards familiar choices, such as sticking to the same brand. To understand how cognitive resource limitations affect human decision making, we conducted a study in which we manipulated the number of optimal choices and the time limit within which choices were made. Across three tasks, we found that people utilize task structure to compress the amount of information factored into their decision making. Information compression biases people towards their past choices. This bias persists even when multiple optimal choices are available, and intensifies under cognitive load and time pressure. A computational model of decision making under cognitive constraints accurately describes the experimental data. Our findings may have the potential to inform the design of choice environments that better align with human decision biases.
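
The trade-off described in the abstract can be illustrated with a small numerical sketch. This page gives no equations, so the formulation below is an assumption: policy complexity is taken to be the mutual information between states and actions, and the compressed policy is taken to have the form π(a|s) ∝ P(a) exp(β Q(s,a)), found by a Blahut-Arimoto-style iteration. The function names, the reward table `Q`, and the values of `beta` are hypothetical and chosen only for illustration; they are not the authors' tasks or fitted parameters.

```python
import numpy as np

def compress_policy(Q, p_state, beta, n_iter=500):
    """Iterate pi(a|s) ∝ P(a) * exp(beta * Q[s, a]) (assumed formulation)."""
    n_actions = Q.shape[1]
    p_action = np.full(n_actions, 1.0 / n_actions)  # marginal P(a), start uniform
    for _ in range(n_iter):
        # Clip before the log so a vanishing marginal does not produce -inf.
        logits = beta * Q + np.log(np.clip(p_action, 1e-12, None))
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        policy = np.exp(logits)
        policy /= policy.sum(axis=1, keepdims=True)
        p_action = p_state @ policy  # update the action marginal
    return policy, p_action

def policy_complexity(policy, p_state):
    """Mutual information I(S; A) in bits under the given policy."""
    joint = p_state[:, None] * policy
    indep = p_state[:, None] * (p_state @ policy)[None, :]
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / indep[mask])))

# Hypothetical task: states 0 and 1 share the same best action (redundancy),
# state 2 has a different best action, and action 2 is never rewarded.
Q = np.array([[1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
p_state = np.full(3, 1.0 / 3.0)

low_policy, low_marginal = compress_policy(Q, p_state, beta=1.0)    # tight capacity
high_policy, high_marginal = compress_policy(Q, p_state, beta=10.0) # loose capacity

low_bits = policy_complexity(low_policy, p_state)
high_bits = policy_complexity(high_policy, p_state)
print("complexity (bits):", low_bits, high_bits)
print("P(action 0 | state 2):", low_policy[2, 0], high_policy[2, 0])
```

Under these assumptions, a smaller `beta` stands in for a tighter capacity limit: the policy carries less state information, and in state 2 more probability shifts to action 0, the action with the highest marginal probability. That is the kind of action bias the abstract describes.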

Suggested Citation

Lucy Lai & Samuel J Gershman, 2024. "Human decision making balances reward maximization and policy compression," PLOS Computational Biology, Public Library of Science, vol. 20(4), pages 1-32, April.

Handle: RePEc:plo:pcbi00:1012057
DOI: 10.1371/journal.pcbi.1012057
