Equilibrium in Misspecified Markov Decision Processes

Author

Listed:

Ignacio Esponda
Demian Pouzo

Registered:

Ignacio Esponda

Abstract

We study Markov decision problems where the agent does not know the transition probability function mapping current states and actions to future states. The agent has a prior belief over a set of possible transition functions and updates beliefs using Bayes' rule. We allow her to be misspecified in the sense that the true transition probability function is not in the support of her prior. This problem is relevant in many economic settings but is usually not amenable to analysis by the researcher. We make the problem tractable by studying asymptotic behavior. We propose an equilibrium notion and provide conditions under which it characterizes steady state behavior. In the special case where the problem is static, equilibrium coincides with the single-agent version of Berk-Nash equilibrium (Esponda and Pouzo (2016)). We also discuss subtle issues that arise exclusively in dynamic settings due to the possibility of a negative value of experimentation.
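
As a rough illustration of the learning problem described in the abstract, the following minimal sketch simulates Bayesian updating over a finite set of candidate transition functions when the true one lies outside the prior's support. It is not the authors' model or equilibrium notion: there is no action choice (the policy is held fixed), and the two-state chain, the matrices `TRUE_Q` and `MODELS`, and the function names are all hypothetical values chosen for illustration.

```python
import math
import random

# True transition matrix over two states (hypothetical numbers).
TRUE_Q = [[0.7, 0.3], [0.4, 0.6]]

# Candidate models in the support of the agent's prior. Neither equals
# TRUE_Q, so the agent is misspecified.
MODELS = {
    "A": [[0.6, 0.4], [0.5, 0.5]],
    "B": [[0.2, 0.8], [0.9, 0.1]],
}


def bayes_update(log_post, s, s_next):
    """Apply Bayes' rule in log space after observing the move s -> s_next."""
    new = {m: lp + math.log(MODELS[m][s][s_next]) for m, lp in log_post.items()}
    top = max(new.values())
    # Log-sum-exp normalization keeps the update numerically stable.
    norm = top + math.log(sum(math.exp(v - top) for v in new.values()))
    return {m: v - norm for m, v in new.items()}


def simulate(steps=2000, seed=0):
    """Draw states from TRUE_Q and return the agent's final posterior."""
    rng = random.Random(seed)
    log_post = {m: math.log(1.0 / len(MODELS)) for m in MODELS}  # uniform prior
    s = 0
    for _ in range(steps):
        s_next = 0 if rng.random() < TRUE_Q[s][0] else 1
        log_post = bayes_update(log_post, s, s_next)
        s = s_next
    return {m: math.exp(lp) for m, lp in log_post.items()}


posterior = simulate()
print(posterior)
```

In this run the posterior puts almost all of its weight on model A, the candidate whose transition probabilities are closer to the truth, even though A is itself wrong. The sketch only shows this belief dynamic under a fixed policy; the interaction between beliefs and optimal actions that the paper studies is left out.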

Suggested Citation

Ignacio Esponda & Demian Pouzo, 2015. "Equilibrium in Misspecified Markov Decision Processes," Papers 1502.06901, arXiv.org, revised May 2016.

Handle: RePEc:arx:papers:1502.06901

Other versions of this item:

Esponda, Ignacio & Pouzo, Demian, 2021. "Equilibrium in misspecified Markov decision processes," Theoretical Economics, Econometric Society, vol. 16(2), May.

References listed on IDEAS

Joshua Schwartzstein, 2014. "Selective Attention And Learning," Journal of the European Economic Association, European Economic Association, vol. 12(6), pages 1423-1452, December.
Fildes, Robert, 1986. "Sensitivity analyses would help: Edward E. Leamer, American Economic Review 75 (1985) 308-313," International Journal of Forecasting, Elsevier, vol. 2(2), pages 237-238.
Dekel, Eddie & Fudenberg, Drew & Levine, David K., 2004. "Learning to play Bayesian games," Games and Economic Behavior, Elsevier, vol. 46(2), pages 282-303, February.
- Eddie Dekel & Drew Fudenberg & David K. Levine, 2000. "Learning to Play Bayesian Games," Discussion Papers 1322, Northwestern University, Center for Mathematical Studies in Economics and Management Science, revised Jul 2001.
- Eddie Dekel & Drew Fudenberg & David K Levine, 2002. "Learning to Play Bayesian Games," Levine's Working Paper Archive 625018000000000151, David K. Levine.
- Eddie Dekel & Drew Fudenberg & David K. Levine, 2001. "Learning to Play Bayesian Games," Harvard Institute of Economic Research Working Papers 1926, Harvard - Institute of Economic Research.
- Dekel, Eddie & Fudenberg, Drew & Levine, David, 2004. "Learning to Play Bayesian Games," Scholarly Articles 3200612, Harvard University Department of Economics.
Fudenberg, Drew & Levine, David K, 1993. "Steady State Learning and Nash Equilibrium," Econometrica, Econometric Society, vol. 61(3), pages 547-573, May.
- Drew Fudenberg & David K. Levine, 1993. "Steady State Learning and Nash Equilibrium," Levine's Working Paper Archive 373, David K. Levine.
Blume, Lawrence E. & Easley, David, 1982. "Learning to be rational," Journal of Economic Theory, Elsevier, vol. 26(2), pages 340-351, April.
Barberis, Nicholas & Shleifer, Andrei & Vishny, Robert, 1998. "A model of investor sentiment," Journal of Financial Economics, Elsevier, vol. 49(3), pages 307-343, September.
- Nicholas Barberis & Andrei Shleifer & Robert W. Vishny, 1997. "A Model of Investor Sentiment," NBER Working Papers 5926, National Bureau of Economic Research, Inc.
- Barberis, Nicholas & Shleifer, Andrei & Vishny, Robert, 1998. "A Model of Investor Sentiment," Scholarly Articles 30747159, Harvard University Department of Economics.
Nabil I. Al-Najjar, 2009. "Decision Makers as Statisticians: Diversity, Ambiguity, and Learning," Econometrica, Econometric Society, vol. 77(5), pages 1371-1401, September.
Fudenberg, Drew & Levine, David K, 1993. "Self-Confirming Equilibrium," Econometrica, Econometric Society, vol. 61(3), pages 523-545, May.
- Fudenberg, D. & Levine, D.K., 1991. "Self-Confirming Equilibrium ," Working papers 581, Massachusetts Institute of Technology (MIT), Department of Economics.
- Drew Fudenberg & David K. Levine, 1993. "Self-Confirming Equilibrium," Levine's Working Paper Archive 2147, David K. Levine.
Erik Eyster & Matthew Rabin, 2005. "Cursed Equilibrium," Econometrica, Econometric Society, vol. 73(5), pages 1623-1672, September.
- Eyster, Erik & Rabin, Matthew, 2002. "Cursed Equilibrium," Department of Economics, Working Paper Series qt7p2911dn, Department of Economics, Institute for Business and Economic Research, UC Berkeley.
- Eyster, Erik & Rabin, Matt, 2002. "Cursed Equilibrium," Department of Economics, Working Paper Series qt6xf4782t, Department of Economics, Institute for Business and Economic Research, UC Berkeley.
- Erik Eyster & Matt Rabin, 2003. "Cursed Equilibrium," Method and Hist of Econ Thought 0303002, University Library of Munich, Germany.
Michele Piccione & Ariel Rubinstein, 2003. "Modeling the Economic Interaction of Agents With Diverse Abilities to Recognize Equilibrium Patterns," Journal of the European Economic Association, MIT Press, vol. 1(1), pages 212-223, March.
- Michele Piccione & Ariel Rubinstein, 2002. "Modelling the Economic Interaction of Agents with Diverse Abilities to Recognise Equilibrium Patterns," STICERD - Theoretical Economics Paper Series 440, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
- Michele Piccione & Ariel Rubinstein, 2010. "Modeling the Economic Interaction of Agents with Diverse Abilities to Recognize Equilibrium Patterns," Levine's Working Paper Archive 506439000000000108, David K. Levine.
- Piccione, Michele & Rubinstein, Ariel, 2002. "Modelling the economic interaction of agents with diverse abilities to recognise equilibrium patterns," LSE Research Online Documents on Economics 2061, London School of Economics and Political Science, LSE Library.
Jehiel, Philippe, 2005. "Analogy-based expectation equilibrium," Journal of Economic Theory, Elsevier, vol. 123(2), pages 81-104, August.
- Philippe Jehiel, 2001. "Analogy-Based Expectation Equilibrium," Economics Working Papers 0003, Institute for Advanced Study, School of Social Science.
- Philippe Jehiel, 2005. "Analogy-Based Expectation Equilibrium," Levine's Bibliography 784828000000000106, UCLA Department of Economics.
- Philippe Jehiel, 2005. "Analogy-based Expectation Equilibrium," Post-Print halshs-00754070, HAL.
Bray, Margaret, 1982. "Learning, estimation, and the stability of rational expectations," Journal of Economic Theory, Elsevier, vol. 26(2), pages 318-339, April.
Rothschild, Michael, 1974. "A two-armed bandit theory of market pricing," Journal of Economic Theory, Elsevier, vol. 9(2), pages 185-202, October.
Philippe Aghion & Patrick Bolton & Christopher Harris & Bruno Jullien, 1991. "Optimal Learning by Experimentation," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 58(4), pages 621-654.
- Aghion, P. & Bolton, P. & Harris, C. & Jullien, B., 1990. "Optimal Learning By Experimentation," DELTA Working Papers 90-10, DELTA (Ecole normale supérieure).
- Aghion Philippe & Bolton, Patrick & Harris Christopher & Jullien Bruno, 1991. "Optimal learning by experimentation," CEPREMAP Working Papers (Couverture Orange) 9104, CEPREMAP.
McLennan, Andrew, 1984. "Price dispersion and incomplete learning in the long run," Journal of Economic Dynamics and Control, Elsevier, vol. 7(3), pages 331-347, September.
Nyarko, Yaw, 1991. "Learning in mis-specified models and the possibility of cycles," Journal of Economic Theory, Elsevier, vol. 55(2), pages 416-427, December.
- Nyarko, Yaw, 1990. "Learning In Mis-Specified Models And The Possibility Of Cycles," Working Papers 90-03, C.V. Starr Center for Applied Economics, New York University.
Blume, Lawrence E. & Easley, David, 1984. "Rational expectations equilibrium: An alternative approach," Journal of Economic Theory, Elsevier, vol. 34(1), pages 116-129, October.
Doraszelski, Ulrich & Escobar, Juan, 2010. "A theory of regular Markov perfect equilibria in dynamic stochastic games: genericity, stability, and purification," Theoretical Economics, Econometric Society, vol. 5(3), September.
- Doraszelski, Ulrich & Escobar, Juan, 2008. "A Theory of Regular Markov Perfect Equilibria in Dynamic Stochastic Games: Genericity, Stability, and Purification," CEPR Discussion Papers 6805, C.E.P.R. Discussion Papers.
- Juan Escobar & Ulrich Doraszelski, 2008. "A Theory of Regular Markov Perfect Equilibria in Dynamic Stochastic Games: Genericity, Stability, and Purification," 2008 Meeting Papers 453, Society for Economic Dynamics.
Fudenberg Drew & Kreps David M., 1993. "Learning Mixed Equilibria," Games and Economic Behavior, Elsevier, vol. 5(3), pages 320-367, July.
- Fudenberg, D. & Kreps, D.M., 1992. "Learning Mixed Equilibria," Working papers 92-13, Massachusetts Institute of Technology (MIT), Department of Economics.
- Drew Fudenberg & David Kreps, 2010. "Learning Mixed Equilibria," Levine's Working Paper Archive 415, David K. Levine.
Enriqueta Aragones & Itzhak Gilboa & Andrew Postlewaite & David Schmeidler, 2012. "Fact-Free Learning," World Scientific Book Chapters, in: Case-Based Predictions: An Axiomatic Approach to Prediction, Classification and Statistical Learning, chapter 8, pages 185-210, World Scientific Publishing Co. Pte. Ltd.
- Enriqueta Aragones & Itzhak Gilboa & Andrew Postlewaite & David Schmeidler, 2005. "Fact-Free Learning," American Economic Review, American Economic Association, vol. 95(5), pages 1355-1368, December.
- Enriqueta Aragones & Itzhak Gilboa & Andrew Postlewaite & David Schmeidler, 2003. "Fact-Free Learning," PIER Working Paper Archive 03-023, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
- Enriqueta Aragones & Itzhak Gilboa & Andrew Postlewaite & David Schmeidler, 2004. "Fact-Free Learning," Cowles Foundation Discussion Papers 1491, Cowles Foundation for Research in Economics, Yale University.
- Enriqueta Aragones & Itzhak Gilboa & Andrew Postlewaite & David Schmeidler, 2003. "Fact-Free Learning," PIER Working Paper Archive 05-002, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania, revised 01 Dec 2004.
- Itzhak Gilboa & Enriqueta Aragones & Andrew Postlewaite & David Schmeidler, 2005. "Fact-Free Learning," Post-Print hal-00481243, HAL.
Kalai, Ehud & Lehrer, Ehud, 1993. "Rational Learning Leads to Nash Equilibrium," Econometrica, Econometric Society, vol. 61(5), pages 1019-1045, September.
- Ehud Kalai & Ehud Lehrer, 1990. "Rational Learning Leads to Nash Equilibrium," Discussion Papers 895, Northwestern University, Center for Mathematical Studies in Economics and Management Science.
- Kalai, Ehud & Lehrer, Ehud, 1991. "Rational Learning Leads to Nash Equilibrium," Working Papers 91-18, C.V. Starr Center for Applied Economics, New York University.
- E. Kalai & E. Lehrer, 2010. "Rational Learning Leads to Nash Equilibrium," Levine's Working Paper Archive 529, David K. Levine.
- Ehud Kalai & Ehud Lehrer, 1990. "Rational Learning Leads to Nash Equilibrium," Discussion Papers 925, Northwestern University, Center for Mathematical Studies in Economics and Management Science.
Sobel, Joel, 1984. "Non-linear prices and price-taking behavior," Journal of Economic Behavior & Organization, Elsevier, vol. 5(3-4), pages 387-396.
Osborne, Martin J & Rubinstein, Ariel, 1998. "Games with Procedurally Rational Players," American Economic Review, American Economic Association, vol. 88(4), pages 834-847, September.
- Osborne, M-J & Rubinstein, A, 1997. "Games with Procedurally Rational Players," Papers 4-97, Tel Aviv.
- Martin J. Osborne & Ariel Rubinstein, 1997. "Games with Procedurally Rational Players," Department of Economics Working Papers 1997-02, McMaster University.
Ignacio Esponda, 2008. "Behavioral Equilibrium in Economies with Adverse Selection," American Economic Review, American Economic Association, vol. 98(4), pages 1269-1291, September.
Jehiel, Philippe & Samet, Dov, 2007. "Valuation equilibrium," Theoretical Economics, Econometric Society, vol. 2(2), June.
- Philippe Jehiel & Dov Samet, 2003. "Valuation Equilibria," Game Theory and Information 0310003, University Library of Munich, Germany.
- Philippe Jehiel & Dov Samet, 2007. "Valuation Equilibrium," PSE-Ecole d'économie de Paris (Postprint) halshs-00754229, HAL.
- Philippe Jehiel & Dov Samet, 2007. "Valuation Equilibrium," Post-Print halshs-00754229, HAL.
- Philippe Jehiel & Dov Samet, 2006. "Valuation Equilibria," Levine's Bibliography 784828000000000111, UCLA Department of Economics.
- Philippe Jehiel & Dov Samet, 2003. "Valuation Equilibria," Levine's Bibliography 666156000000000046, UCLA Department of Economics.

Citations

Cited by:

Ignacio Esponda & Demian Pouzo, 2026. "Learning and Equilibrium under Model Misspecification," Papers 2601.09891, arXiv.org.
Yingkai Li & Aleksandrs Slivkins, 2022. "Exploration and Incentivizing Participation in Randomized Trials," Papers 2202.06191, arXiv.org, revised Jan 2026.
Thomas J. Sargent & John Stachurski, 2024. "Dynamic Programming: Finite States," Papers 2401.10473, arXiv.org.
Esponda, Ignacio & Pouzo, Demian & Yamamoto, Yuichi, 2021. "Asymptotic behavior of Bayesian learners with misspecified models," Journal of Economic Theory, Elsevier, vol. 195(C).
- Ignacio Esponda & Demian Pouzo & Yuichi Yamamoto, 2019. "Asymptotic Behavior of Bayesian Learners with Misspecified Models," Papers 1904.08551, arXiv.org, revised Oct 2019.
Anderson, Robert M. & Duanmu, Haosui & Ghosh, Aniruddha & Khan, M. Ali, 2024. "On existence of Berk-Nash equilibria in misspecified Markov decision processes with infinite spaces," Journal of Economic Theory, Elsevier, vol. 217(C).
- Robert M. Anderson & Haosui Duanmu & Aniruddha Ghosh & M. Ali Khan, 2022. "On Existence of Berk-Nash Equilibria in Misspecified Markov Decision Processes with Infinite Spaces," Papers 2206.08437, arXiv.org, revised Jul 2023.
Fudenberg, Drew & Romanyuk, Gleb & Strack, Philipp, 2017. "Active learning with a misspecified prior," Theoretical Economics, Econometric Society, vol. 12(3), September.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Ignacio Esponda & Demian Pouzo, 2016. "Berk–Nash Equilibrium: A Framework for Modeling Agents With Misspecified Models," Econometrica, Econometric Society, vol. 84, pages 1093-1130, May.
- Ignacio Esponda & Demian Pouzo, 2014. "Berk-Nash Equilibrium: A Framework for Modeling Agents with Misspecified Models," Papers 1411.1152, arXiv.org, revised Nov 2019.
Esponda, Ignacio & Pouzo, Demian & Yamamoto, Yuichi, 2021. "Asymptotic behavior of Bayesian learners with misspecified models," Journal of Economic Theory, Elsevier, vol. 195(C).
- Ignacio Esponda & Demian Pouzo & Yuichi Yamamoto, 2019. "Asymptotic Behavior of Bayesian Learners with Misspecified Models," Papers 1904.08551, arXiv.org, revised Oct 2019.
Philippe Jehiel, 2022. "Analogy-Based Expectation Equilibrium and Related Concepts: Theory, Applications, and Beyond," Working Papers halshs-03735680, HAL.
- Philippe Jehiel, 2022. "Analogy-Based Expectation Equilibrium and Related Concepts: Theory, Applications, and Beyond," PSE Working Papers halshs-03735680, HAL.
Fudenberg, Drew & Romanyuk, Gleb & Strack, Philipp, 2017. "Active learning with a misspecified prior," Theoretical Economics, Econometric Society, vol. 12(3), September.
Christoph March, 2011. "Adaptive social learning," Working Papers halshs-00572528, HAL.
- Christoph March, 2016. "Adaptive Social Learning," CESifo Working Paper Series 5783, CESifo.
- Christoph March, 2011. "Adaptive social learning," PSE Working Papers halshs-00572528, HAL.
Topi Miettinen, 2012. "Paying attention to payoffs in analogy-based learning," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 50(1), pages 193-222, May.
- Miettinen, Topi, 2009. "Paying Attention to Payoffs in Analogy-Based Learning," SITE Working Paper Series 7, Stockholm School of Economics, Stockholm Institute of Transition Economics.
Mario Gilli, 2002. "Rational Learning in Imperfect Monitoring Games," Working Papers 46, University of Milano-Bicocca, Department of Economics, revised Mar 2002.
Manxi Wu & Saurabh Amin & Asuman Ozdaglar, 2021. "Multi-agent Bayesian Learning with Best Response Dynamics: Convergence and Stability," Papers 2109.00719, arXiv.org.
Sobel, Joel, 2000. "Economists' Models of Learning," Journal of Economic Theory, Elsevier, vol. 94(2), pages 241-261, October.
Cristián Sánchez, 2025. "Equilibrium Consequences of Vouchers Under Simultaneous Extensive and Intensive Margins Competition," Working Papers Central Bank of Chile 1038, Central Bank of Chile.
Ran Spiegler, 2016. "Bayesian Networks and Boundedly Rational Expectations," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 131(3), pages 1243-1290.
- Spiegler, Ran, "undated". "Bayesian Networks and Boundedly Rational Expectations," Foerder Institute for Economic Research Working Papers 275828, Tel-Aviv University > Foerder Institute for Economic Research.
- Spiegler, Ran, 2014. "Bayesian networks and boundedly rational expectations," LSE Research Online Documents on Economics 57994, London School of Economics and Political Science, LSE Library.
- Ran Spiegler, 2014. "Bayesian Networks and Boundedly Rational Expectations," Discussion Papers 1417, Centre for Macroeconomics (CFM).
- Spiegler, Ran, 2014. "Bayesian Networks and Boundedly Rational Expectations," CEPR Discussion Papers 10062, Centre for Economic Policy Research.
Philippe Jehiel & Erik Mohlin, 2023. "Categorization in Games: A Bias-Variance Perspective," Working Papers halshs-04154272, HAL.
- Jehiel, Philippe & Mohlin, Erik, 2025. "Categorization in Games: A Bias-Variance Perspective," Working Papers 2025:7, Lund University, Department of Economics.
Miettinen, Topi, 2009. "The partially cursed and the analogy-based expectation equilibrium," Economics Letters, Elsevier, vol. 105(2), pages 162-164, November.
Mira Frick & Ryota Iijima & Yuhta Ishii, 2020. "Belief Convergence under Misspecified Learning: A Martingale Approach," Cowles Foundation Discussion Papers 2235R, Cowles Foundation for Research in Economics, Yale University, revised Mar 2021.
- Mira Frick & Ryota Iijima & Yuhta Ishii, 2020. "Belief Convergence under Misspecified Learning: A Martingale Approach," Cowles Foundation Discussion Papers 2235R2, Cowles Foundation for Research in Economics, Yale University, revised Dec 2021.
- Frick, Mira & Iijima, Ryota & Ishii, Yuhta, 2021. "Belief Convergence under Misspecified Learning: A Martingale Approach," CEPR Discussion Papers 16788, C.E.P.R. Discussion Papers.
Zacharias Maniadis, 2014. "Selective revelation of public information and self-confirming equilibrium," International Journal of Game Theory, Springer;Game Theory Society, vol. 43(4), pages 991-1008, November.
Ignacio Esponda & Demian Pouzo, 2026. "Learning and Equilibrium under Model Misspecification," Papers 2601.09891, arXiv.org.
Mira Frick & Ryota Iijima & Yuhta Ishii, 2020. "Belief Convergence under Misspecified Learning: A Martingale Approach," Cowles Foundation Discussion Papers 2235R3, Cowles Foundation for Research in Economics, Yale University, revised Apr 2022.
Liu, Zhen, 2016. "Games with incomplete information when players are partially aware of others’ signals," Journal of Mathematical Economics, Elsevier, vol. 65(C), pages 58-70.
Mira Frick & Ryota Iijima & Yuhta Ishii, 2020. "Stability and Robustness in Misspecified Learning Models," Cowles Foundation Discussion Papers 2235, Cowles Foundation for Research in Economics, Yale University.
S. Nageeb Ali, 2011. "Learning Self-Control," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 126(2), pages 857-893.
- S. Nageeb Ali, 2009. "Learning Self-Control," Levine's Working Paper Archive 814577000000000384, David K. Levine.

More about this item

JEL classification:

C61 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling - - - Optimization Techniques; Programming Models; Dynamic Analysis
D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness

NEP fields

This paper has been announced in the following NEP Reports:

NEP-MIC-2015-02-28 (Microeconomics)
