
Equilibrium in Misspecified Markov Decision Processes

Author

Listed:
  • Ignacio Esponda
  • Demian Pouzo

Abstract

We study Markov decision problems where the agent does not know the transition probability function mapping current states and actions to future states. The agent has a prior belief over a set of possible transition functions and updates beliefs using Bayes' rule. We allow her to be misspecified in the sense that the true transition probability function is not in the support of her prior. This problem is relevant in many economic settings but is usually not amenable to analysis by the researcher. We make the problem tractable by studying asymptotic behavior. We propose an equilibrium notion and provide conditions under which it characterizes steady state behavior. In the special case where the problem is static, equilibrium coincides with the single-agent version of Berk-Nash equilibrium (Esponda and Pouzo (2016)). We also discuss subtle issues that arise exclusively in dynamic settings due to the possibility of a negative value of experimentation.
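The belief side of the abstract can be illustrated with a minimal sketch. It is not the authors' model or equilibrium computation: it assumes a two-state chain with a single fixed action, two hypothetical candidate transition models (`MODELS`), and illustrative probabilities (`TRUE_P1`) chosen only for this example. The agent's prior excludes the true transition function, and Bayes' rule is applied to the observed transitions. Under these assumptions the posterior should concentrate on the candidate with the smaller Kullback-Leibler divergence from the truth, weighted by how often each state is visited, which is the kind of steady-state belief restriction that a Berk-Nash-style notion builds on.

```python
import math
import random

# True probability of moving to state 1 from each current state (states are 0 and 1).
# All numbers are illustrative assumptions, not taken from the paper.
TRUE_P1 = {0: 0.7, 1: 0.4}

# The agent's prior puts weight only on these two hypothetical models.
# Neither equals TRUE_P1, so the agent is misspecified.
MODELS = {
    "A": {0: 0.6, 1: 0.5},
    "B": {0: 0.2, 1: 0.9},
}


def kl_bernoulli(p, q):
    """KL divergence from Bernoulli(p) to Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def simulate_posterior(steps=5000, seed=0):
    """Run the true chain and update a uniform prior over MODELS with Bayes' rule."""
    rng = random.Random(seed)
    log_post = {name: math.log(0.5) for name in MODELS}  # log of the uniform prior
    visits = {0: 0, 1: 0}
    state = 0
    for _ in range(steps):
        visits[state] += 1
        nxt = 1 if rng.random() < TRUE_P1[state] else 0
        # Bayes' rule in logs: add each model's log-likelihood of the observed transition.
        for name, model in MODELS.items():
            p1 = model[state]
            log_post[name] += math.log(p1 if nxt == 1 else 1 - p1)
        state = nxt
    # Normalize in a numerically stable way.
    top = max(log_post.values())
    weights = {name: math.exp(v - top) for name, v in log_post.items()}
    total = sum(weights.values())
    posterior = {name: w / total for name, w in weights.items()}
    freq = {s: n / steps for s, n in visits.items()}
    return posterior, freq


def weighted_kl(model, freq):
    """KL divergence from the truth, weighted by empirical state frequencies."""
    return sum(freq[s] * kl_bernoulli(TRUE_P1[s], model[s]) for s in freq)


posterior, freq = simulate_posterior()
divergence = {name: weighted_kl(model, freq) for name, model in MODELS.items()}
best = min(divergence, key=divergence.get)
print("posterior:", posterior)
print("weighted KL:", divergence)
print("KL-minimizing model:", best)
```

Because the action is held fixed here, the sketch leaves out the feature that makes the dynamic problem hard: in the paper's setting the agent's policy affects which states are visited, and hence which model fits best, so beliefs and behavior have to be determined jointly.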

Suggested Citation

  • Ignacio Esponda & Demian Pouzo, 2015. "Equilibrium in Misspecified Markov Decision Processes," Papers 1502.06901, arXiv.org, revised May 2016.
  • Handle: RePEc:arx:papers:1502.06901

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1502.06901
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Joshua Schwartzstein, 2014. "Selective Attention And Learning," Journal of the European Economic Association, European Economic Association, vol. 12(6), pages 1423-1452, December.
    2. Fildes, Robert, 1986. "Sensitivity analyses would help: Edward E. Leamer, American Economic Review 75 (1985) 308-313," International Journal of Forecasting, Elsevier, vol. 2(2), pages 237-238.
    3. Philippe Aghion & Patrick Bolton & Christopher Harris & Bruno Jullien, 1991. "Optimal Learning by Experimentation," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 58(4), pages 621-654.
    4. Blume, Lawrence E. & Easley, David, 1982. "Learning to be rational," Journal of Economic Theory, Elsevier, vol. 26(2), pages 340-351, April.
    5. Jehiel, Philippe & Samet, Dov, 2007. "Valuation equilibrium," Theoretical Economics, Econometric Society, vol. 2(2), June.
    6. Nyarko, Yaw, 1991. "Learning in mis-specified models and the possibility of cycles," Journal of Economic Theory, Elsevier, vol. 55(2), pages 416-427, December.
    7. Fudenberg, Drew & Levine, David K, 1993. "Self-Confirming Equilibrium," Econometrica, Econometric Society, vol. 61(3), pages 523-545, May.
    8. Erik Eyster & Matthew Rabin, 2005. "Cursed Equilibrium," Econometrica, Econometric Society, vol. 73(5), pages 1623-1672, September.
    9. Michele Piccione & Ariel Rubinstein, 2003. "Modeling the Economic Interaction of Agents With Diverse Abilities to Recognize Equilibrium Patterns," Journal of the European Economic Association, MIT Press, vol. 1(1), pages 212-223, March.
    10. Jehiel, Philippe, 2005. "Analogy-based expectation equilibrium," Journal of Economic Theory, Elsevier, vol. 123(2), pages 81-104, August.
    11. Rothschild, Michael, 1974. "A two-armed bandit theory of market pricing," Journal of Economic Theory, Elsevier, vol. 9(2), pages 185-202, October.
    12. Kalai, Ehud & Lehrer, Ehud, 1993. "Rational Learning Leads to Nash Equilibrium," Econometrica, Econometric Society, vol. 61(5), pages 1019-1045, September.
    13. Dekel, Eddie & Fudenberg, Drew & Levine, David K., 2004. "Learning to play Bayesian games," Games and Economic Behavior, Elsevier, vol. 46(2), pages 282-303, February.
    14. Enriqueta Aragones & Itzhak Gilboa & Andrew Postlewaite & David Schmeidler, 2012. "Fact-Free Learning," World Scientific Book Chapters, in: Case-Based Predictions: An Axiomatic Approach to Prediction, Classification and Statistical Learning, chapter 8, pages 185-210, World Scientific Publishing Co. Pte. Ltd.
    15. Fudenberg, Drew & Levine, David K, 1993. "Steady State Learning and Nash Equilibrium," Econometrica, Econometric Society, vol. 61(3), pages 547-573, May.
    16. Barberis, Nicholas & Shleifer, Andrei & Vishny, Robert, 1998. "A model of investor sentiment," Journal of Financial Economics, Elsevier, vol. 49(3), pages 307-343, September.
    17. McLennan, Andrew, 1984. "Price dispersion and incomplete learning in the long run," Journal of Economic Dynamics and Control, Elsevier, vol. 7(3), pages 331-347, September.
    18. Ignacio Esponda, 2008. "Behavioral Equilibrium in Economies with Adverse Selection," American Economic Review, American Economic Association, vol. 98(4), pages 1269-1291, September.
    19. Fudenberg Drew & Kreps David M., 1993. "Learning Mixed Equilibria," Games and Economic Behavior, Elsevier, vol. 5(3), pages 320-367, July.
    20. Osborne, Martin J & Rubinstein, Ariel, 1998. "Games with Procedurally Rational Players," American Economic Review, American Economic Association, vol. 88(4), pages 834-847, September.
    21. Doraszelski, Ulrich & Escobar, Juan F., 2010. "A theory of regular Markov perfect equilibria in dynamic stochastic games: genericity, stability, and purification," Theoretical Economics, Econometric Society, vol. 5(3), September.
    22. Bray, Margaret, 1982. "Learning, estimation, and the stability of rational expectations," Journal of Economic Theory, Elsevier, vol. 26(2), pages 318-339, April.
    23. Blume, Lawrence E. & Easley, David, 1984. "Rational expectations equilibrium: An alternative approach," Journal of Economic Theory, Elsevier, vol. 34(1), pages 116-129, October.
    24. Sobel, Joel, 1984. "Non-linear prices and price-taking behavior," Journal of Economic Behavior & Organization, Elsevier, vol. 5(3-4), pages 387-396.
    25. Nabil I. Al-Najjar, 2009. "Decision Makers as Statisticians: Diversity, Ambiguity, and Learning," Econometrica, Econometric Society, vol. 77(5), pages 1371-1401, September.

    Citations

    Cited by:

    1. Ignacio Esponda & Demian Pouzo & Yuichi Yamamoto, 2019. "Asymptotic Behavior of Bayesian Learners with Misspecified Models," Papers 1904.08551, arXiv.org, revised Oct 2019.
    2. Fudenberg, Drew & Romanyuk, Gleb & Strack, Philipp, 2017. "Active learning with a misspecified prior," Theoretical Economics, Econometric Society, vol. 12(3), September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ignacio Esponda & Demian Pouzo, 2014. "Berk-Nash Equilibrium: A Framework for Modeling Agents with Misspecified Models," Papers 1411.1152, arXiv.org, revised Nov 2019.
    2. Esponda, Ignacio & Pouzo, Demian & Yamamoto, Yuichi, 2021. "Asymptotic behavior of Bayesian learners with misspecified models," Journal of Economic Theory, Elsevier, vol. 195(C).
    3. Ignacio Esponda & Demian Pouzo & Yuichi Yamamoto, 2019. "Asymptotic Behavior of Bayesian Learners with Misspecified Models," Papers 1904.08551, arXiv.org, revised Oct 2019.
    4. Philippe Jehiel, 2022. "Analogy-Based Expectation Equilibrium and Related Concepts:Theory, Applications, and Beyond," Working Papers halshs-03735680, HAL.
    5. Topi Miettinen, 2012. "Paying attention to payoffs in analogy-based learning," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 50(1), pages 193-222, May.
    6. Fudenberg, Drew & Romanyuk, Gleb & Strack, Philipp, 2017. "Active learning with a misspecified prior," Theoretical Economics, Econometric Society, vol. 12(3), September.
    7. Christoph March, 2011. "Adaptive social learning," Working Papers halshs-00572528, HAL.
    8. Mario Gilli, 2002. "Rational Learning in Imperfect Monitoring Games," Working Papers 46, University of Milano-Bicocca, Department of Economics, revised Mar 2002.
    9. Jean-Michel Grandmont, 1998. "Expectations Formation and Stability of Large Socioeconomic Systems," Econometrica, Econometric Society, vol. 66(4), pages 741-782, July.
    10. Manxi Wu & Saurabh Amin & Asuman Ozdaglar, 2021. "Multi-agent Bayesian Learning with Best Response Dynamics: Convergence and Stability," Papers 2109.00719, arXiv.org.
    11. Sobel, Joel, 2000. "Economists' Models of Learning," Journal of Economic Theory, Elsevier, vol. 94(2), pages 241-261, October.
    12. Ran Spiegler, 2016. "Bayesian Networks and Boundedly Rational Expectations," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 131(3), pages 1243-1290.
    13. S. Nageeb Ali, 2011. "Learning Self-Control," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 126(2), pages 857-893.
    14. Philippe Jehiel & Erik Mohlin, 2023. "Categorization in Games: A Bias-Variance Perspective," Working Papers halshs-04154272, HAL.
    15. Miettinen, Topi, 2009. "The partially cursed and the analogy-based expectation equilibrium," Economics Letters, Elsevier, vol. 105(2), pages 162-164, November.
    16. Mira Frick & Ryota Iijima & Yuhta Ishii, 2020. "Belief Convergence under Misspecified Learning: A Martingale Approach," Cowles Foundation Discussion Papers 2235R, Cowles Foundation for Research in Economics, Yale University, revised Mar 2021.
    17. Zacharias Maniadis, 2014. "Selective revelation of public information and self-confirming equilibrium," International Journal of Game Theory, Springer;Game Theory Society, vol. 43(4), pages 991-1008, November.
    18. Mira Frick & Ryota Iijima & Yuhta Ishii, 2020. "Belief Convergence under Misspecified Learning: A Martingale Approach," Cowles Foundation Discussion Papers 2235R3, Cowles Foundation for Research in Economics, Yale University, revised Apr 2022.
    19. Liu, Zhen, 2016. "Games with incomplete information when players are partially aware of others’ signals," Journal of Mathematical Economics, Elsevier, vol. 65(C), pages 58-70.
    20. Mira Frick & Ryota Iijima & Yuhta Ishii, 2020. "Stability and Robustness in Misspecified Learning Models," Cowles Foundation Discussion Papers 2235, Cowles Foundation for Research in Economics, Yale University.
