Mixtures of strategies underlie rodent behavior during reversal learning

Author

Listed:

Nhat Minh Le
Murat Yildirim
Yizhi Wang
Hiroki Sugihara
Mehrdad Jazayeri
Mriganka Sur

Abstract

In reversal learning tasks, the behavior of humans and animals is often assumed to be uniform within single experimental sessions to facilitate data analysis and model fitting. However, the behavior of agents can vary substantially within a single session, as they execute different blocks of trials with different transition dynamics. Here, we observed that in a deterministic reversal learning task, mice displayed noisy and sub-optimal choice transitions even at the expert stages of learning. We investigated two sources of this sub-optimality. First, we found that mice exhibited a high lapse rate during task execution, as they reverted to unrewarded directions after choice transitions. Second, we unexpectedly found that a majority of mice did not execute a uniform strategy, but rather mixed between several behavioral modes with different transition dynamics. We quantified the use of such mixtures with a state-space model, the block Hidden Markov Model (blockHMM), to dissociate the mixtures of dynamic choice transitions in individual blocks of trials. Additionally, we found that blockHMM transition modes in rodent behavior can be accounted for by two different types of behavioral algorithms, model-free or inference-based learning, that might be used to solve the task. Combining these approaches, we found that mice used a mixture of both exploratory, model-free strategies and deterministic, inference-based behavior in the task, explaining their overall noisy choice sequences.
Together, our combined computational approach highlights intrinsic sources of noise in rodent reversal learning behavior and provides a richer description of behavior than conventional techniques, while uncovering the hidden states that underlie the block-by-block transitions.

Author summary: Humans and animals can use diverse decision-making strategies to maximize rewards in uncertain environments, but previous studies have not investigated the use of multiple strategies that involve distinct latent switching dynamics in reward-guided behavior. Here, using a reversal learning task, we showed that mice displayed much more variable behavior than would be expected from a uniform strategy, suggesting that they mix between multiple behavioral modes in the task. We developed a computational method to dissociate these learning modes from behavioral data, addressing the challenges faced by current analytical methods when agents mix between different strategies. We found that the use of multiple strategies is a key feature of rodent behavior even in the expert stages of learning, and applied our tools to quantify the highly diverse strategies used by individual mice in the task. We further mapped these behavioral modes to two types of underlying algorithms, model-free Q-learning and inference-based behavior. These rich descriptions of underlying latent states form the basis for detecting abnormal patterns of behavior in reward-guided decision-making.
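To make the contrast between the two algorithm families named in the abstract concrete, the following is a minimal sketch, not the authors' blockHMM or their fitted models. It simulates a model-free Q-learning agent and a simple inference-based agent in a deterministic two-choice reversal task and returns the fraction of correct choices at each trial position within a block. All names and values here (`QLearner`, `InferenceAgent`, `simulate`, `alpha`, `beta`, `hazard`, the block length, and the treatment of a lapse as a uniformly random choice) are assumptions made for illustration.

```python
import math
import random


class QLearner:
    """Model-free agent: learns action values and chooses via a softmax (hypothetical parameters)."""

    def __init__(self, alpha=0.3, beta=5.0):
        self.alpha, self.beta = alpha, beta
        self.q = [0.5, 0.5]  # value estimates for side 0 and side 1

    def choose(self, rng):
        # Two-option softmax reduces to a logistic function of the value difference.
        p_side1 = 1.0 / (1.0 + math.exp(-self.beta * (self.q[1] - self.q[0])))
        return 1 if rng.random() < p_side1 else 0

    def update(self, choice, reward):
        # Only the chosen side's value moves toward the observed outcome.
        self.q[choice] += self.alpha * (reward - self.q[choice])


class InferenceAgent:
    """Inference-based agent: tracks a belief about which side is currently rewarded."""

    def __init__(self, hazard=0.05):
        self.hazard = hazard  # assumed per-trial probability of a reversal
        self.p_side0 = 0.5    # belief that side 0 is the rewarded side

    def choose(self, rng):
        return 0 if self.p_side0 >= 0.5 else 1

    def update(self, choice, reward):
        # Rewards are deterministic, so a single outcome reveals the current state.
        side0_rewarded = (choice == 0) == (reward == 1)
        posterior = 1.0 if side0_rewarded else 0.0
        # Carry the belief forward one trial, allowing for a possible reversal.
        self.p_side0 = posterior * (1 - self.hazard) + (1 - posterior) * self.hazard


def simulate(agent, n_blocks=200, block_len=30, lapse=0.0, seed=0):
    """Return the fraction of correct choices at each trial position within a block."""
    rng = random.Random(seed)
    rewarded_side = 0
    n_correct = [0] * block_len
    for _ in range(n_blocks):
        for t in range(block_len):
            if rng.random() < lapse:
                choice = rng.randrange(2)  # lapse modeled as a random choice (assumption)
            else:
                choice = agent.choose(rng)
            reward = 1 if choice == rewarded_side else 0
            agent.update(choice, reward)
            n_correct[t] += reward
        rewarded_side = 1 - rewarded_side  # reversal at the end of every block
    return [c / n_blocks for c in n_correct]


if __name__ == "__main__":
    print("Q-learning:", [round(v, 2) for v in simulate(QLearner(), lapse=0.1)[:5]])
    print("Inference: ", [round(v, 2) for v in simulate(InferenceAgent())[:5]])
```

Under these assumptions, the inference-based agent without lapses switches after a single unrewarded trial, whereas the Q-learning agent needs several unrewarded trials before its value estimates cross. That difference in within-block transition curves is the kind of signature the abstract describes separating with the blockHMM.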

Suggested Citation

Nhat Minh Le & Murat Yildirim & Yizhi Wang & Hiroki Sugihara & Mehrdad Jazayeri & Mriganka Sur, 2023. "Mixtures of strategies underlie rodent behavior during reversal learning," PLOS Computational Biology, Public Library of Science, vol. 19(9), pages 1-28, September.

Handle: RePEc:plo:pcbi00:1011430
DOI: 10.1371/journal.pcbi.1011430

Citations

Cited by:

Bin A. Wang & Mien Brabeeba Wang & Norman H. Lam & Liu Mengxing & Shumei Li & Ralf D. Wimmer & Pedro M. Paz-Alonso & Michael M. Halassa & Burkhard Pleger, 2025. "Thalamic regulation of reinforcement learning strategies across prefrontal-striatal networks," Nature Communications, Nature, vol. 16(1), pages 1-19, December.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Matthias S Keil & Agata Lapedriza & David Masip & Jordi Vitria, 2008. "Preferred Spatial Frequencies for Human Face Processing Are Associated with Optimal Class Discrimination in the Machine," PLOS ONE, Public Library of Science, vol. 3(7), pages 1-5, July.
Rebecca J. Rabinovich & Daniel D. Kato & Randy M. Bruno, 2022. "Learning enhances encoding of time and temporal surprise in mouse primary sensory cortex," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
Wenqi Chen & Jiejunyi Liang & Qiyun Wu & Yunyun Han, 2024. "Anterior cingulate cortex provides the neural substrates for feedback-driven iteration of decision and value representation," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
Matthias S Keil, 2009. "“I Look in Your Eyes, Honey”: Internal Face Features Induce Spatial Frequency Preference for Human Face Processing," PLOS Computational Biology, Public Library of Science, vol. 5(3), pages 1-13, March.
Krishnamurthy V. Vemuru, 2022. "Implementation of the Canny Edge Detector Using a Spiking Neural Network," Future Internet, MDPI, vol. 14(12), pages 1-12, December.
Masakazu Agetsuma & Issei Sato & Yasuhiro R. Tanaka & Luis Carrillo-Reid & Atsushi Kasai & Atsushi Noritake & Yoshiyuki Arai & Miki Yoshitomo & Takashi Inagaki & Hiroshi Yukawa & Hitoshi Hashimoto & J, 2023. "Activity-dependent organization of prefrontal hub-networks for associative learning and signal transformation," Nature Communications, Nature, vol. 14(1), pages 1-22, December.
Masashi Hasegawa & Ziyan Huang & Ricardo Paricio-Montesinos & Jan Gründemann, 2024. "Network state changes in sensory thalamus represent learned outcomes," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
Tristan G. Heintz & Antonio J. Hinojosa & Sina E. Dominiak & Leon Lagnado, 2022. "Opposite forms of adaptation in mouse visual cortex are controlled by distinct inhibitory microcircuits," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
Nicholas Cole & Matthew Harvey & Dylan Myers-Joseph & Aditya Gilra & Adil G. Khan, 2024. "Prediction-error signals in anterior cingulate cortex drive task-switching," Nature Communications, Nature, vol. 15(1), pages 1-20, December.
Zhong Ren & Xiaolu Wang & Milen Angelov & Chris I. De Zeeuw & Zhenyu Gao, 2025. "Neuronal dynamics of cerebellum and medial prefrontal cortex in adaptive motor timing," Nature Communications, Nature, vol. 16(1), pages 1-18, December.
Gabriel D Puccini & Maria V Sanchez-Vives & Albert Compte, 2007. "Integrated Mechanisms of Anticipation and Rate-of-Change Computations in Cortical Circuits," PLOS Computational Biology, Public Library of Science, vol. 3(5), pages 1-13, May.
Maëlle Guyoton & Giulio Matteucci & Charlie G. Foucher & Matthew P. Getz & Julijana Gjorgjieva & Sami El-Boustani, 2025. "Cortical circuits for cross-modal generalization," Nature Communications, Nature, vol. 16(1), pages 1-23, December.
Christina Mo & Claire McKinnon & S. Murray Sherman, 2024. "A transthalamic pathway crucial for perception," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
Bin A. Wang & Maike Veismann & Abhishek Banerjee & Burkhard Pleger, 2023. "Human orbitofrontal cortex signals decision outcomes to sensory cortex during behavioral adaptations," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
Jonathan Schaffner & Sherry Dongqi Bao & Philippe N. Tobler & Todd A. Hare & Rafael Polania, 2023. "Sensory perception relies on fitness-maximizing codes," Nature Human Behaviour, Nature, vol. 7(7), pages 1135-1151, July.
Miguel Maravall & Rasmus S Petersen & Adrienne L Fairhall & Ehsan Arabzadeh & Mathew E Diamond, 2007. "Shifts in Coding Properties and Maintenance of Information Transmission during Adaptation in Barrel Cortex," PLOS Biology, Public Library of Science, vol. 5(2), pages 1-12, January.
Tao Xie & Markus Adamek & Hohyun Cho & Matthew A. Adamo & Anthony L. Ritaccio & Jon T. Willie & Peter Brunner & Jan Kubanek, 2024. "Graded decisions in the human brain," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
Shinichiro Kira & Houman Safaai & Ari S. Morcos & Stefano Panzeri & Christopher D. Harvey, 2023. "A distributed and efficient population code of mixed selectivity neurons for flexible navigation decisions," Nature Communications, Nature, vol. 14(1), pages 1-28, December.
Filippo Heimburg & Nadin Mari Saluti & Josephine Timm & Avi Adlakha & Maria Helena Bortolozzo-Gleich & Jesús Martín-Cortecero & Melina Castelanelli & Matthias Klumpp & Lee Embray & Martin Both & Thoma, 2025. "A tactile discrimination task to study neuronal dynamics in freely-moving mice," Nature Communications, Nature, vol. 16(1), pages 1-20, December.
Johnatan Aljadeff & Ronen Segev & Michael J Berry II & Tatyana O Sharpee, 2013. "Spike Triggered Covariance in Strongly Correlated Gaussian Stimuli," PLOS Computational Biology, Public Library of Science, vol. 9(9), pages 1-12, September.
