
Ambiguous partially observable Markov decision processes: Structural results and applications

Author

Listed:
  • Saghafian, Soroush

Abstract

Markov Decision Processes (MDPs) have been widely used as invaluable tools in dynamic decision-making, which is a central concern for economic agents operating at both the micro and macro levels. Often the decision maker's information about the state is incomplete; hence, the generalization to Partially Observable MDPs (POMDPs). Unfortunately, POMDPs may require a large state and/or action space, creating the well-known “curse of dimensionality.” However, recent computational contributions and blindingly fast computers have helped to dispel this curse. This paper introduces and addresses a second curse, termed the “curse of ambiguity,” which refers to the fact that the exact transition probabilities are often hard to quantify and are instead ambiguous. For instance, for a monetary authority concerned with dynamically setting the inflation rate so as to control unemployment, the dynamics of the unemployment rate under any given inflation rate are often ambiguous. Similarly, in worker-job matching, the dynamics of the worker-job match/proficiency level are typically ambiguous. This paper addresses the “curse of ambiguity” by developing a generalization of POMDPs termed Ambiguous POMDPs (APOMDPs), which not only allows the decision maker to take into account imperfect state information, but also tackles the inevitable ambiguity with respect to the correct probabilistic model of transitions.

Suggested Citation

  • Saghafian, Soroush, 2018. "Ambiguous partially observable Markov decision processes: Structural results and applications," Journal of Economic Theory, Elsevier, vol. 178(C), pages 1-35.
  • Handle: RePEc:eee:jetheo:v:178:y:2018:i:c:p:1-35
    DOI: 10.1016/j.jet.2018.08.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0022053118304770
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.jet.2018.08.006?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations


    Cited by:

    1. Andrew J. Keith & Darryl K. Ahner, 2021. "A survey of decision making and optimization under uncertainty," Annals of Operations Research, Springer, vol. 300(2), pages 319-353, May.
    2. Frick, Mira & Iijima, Ryota & Le Yaouanq, Yves, 2022. "Objective rationality foundations for (dynamic) α-MEU," Journal of Economic Theory, Elsevier, vol. 200(C).
    3. Mira Frick & Ryota Iijima & Yves Le Yaouanq, 2020. "Objective rationality foundations for (dynamic) alpha-MEU," Cowles Foundation Discussion Papers 2244R, Cowles Foundation for Research in Economics, Yale University, revised Jul 2021.
    4. Tomasz Kosmala & Randall Martyr & John Moriarty, 2020. "Markov risk mappings and risk-sensitive optimal prediction," Papers 2001.06895, arXiv.org, revised Sep 2022.
    5. Randall Martyr & John Moriarty & Magnus Perninge, 2019. "Discrete-time risk-aware optimal switching with non-adapted costs," Papers 1910.04047, arXiv.org, revised Sep 2021.
    6. Miehling, Erik & Teneketzis, Demosthenis, 2020. "Monotonicity properties for two-action partially observable Markov decision processes on partially ordered spaces," European Journal of Operational Research, Elsevier, vol. 282(3), pages 936-944.
    7. Alireza Boloori & Soroush Saghafian & Harini A. Chakkera & Curtiss B. Cook, 2020. "Data-Driven Management of Post-transplant Medications: An Ambiguous Partially Observable Markov Decision Process Approach," Manufacturing & Service Operations Management, INFORMS, vol. 22(5), pages 1066-1087, September.
    8. Maximilian Blesch & Philipp Eisenhauer, 2021. "Robust Decision-Making Under Risk and Ambiguity," ECONtribute Discussion Papers Series 104, University of Bonn and University of Cologne, Germany.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rasouli, Mohammad & Saghafian, Soroush, 2018. "Robust Partially Observable Markov Decision Processes," Working Paper Series rwp18-027, Harvard University, John F. Kennedy School of Government.
    2. Andrew J. Keith & Darryl K. Ahner, 2021. "A survey of decision making and optimization under uncertainty," Annals of Operations Research, Springer, vol. 300(2), pages 319-353, May.
    3. Spyros Galanis, 2021. "Dynamic consistency, valuable information and subjective beliefs," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 71(4), pages 1467-1497, June.
    4. Cerreia-Vioglio, S. & Maccheroni, F. & Marinacci, M. & Montrucchio, L., 2011. "Uncertainty averse preferences," Journal of Economic Theory, Elsevier, vol. 146(4), pages 1275-1330, July.
    5. Gumen, Anna & Savochkin, Andrei, 2013. "Dynamically stable preferences," Journal of Economic Theory, Elsevier, vol. 148(4), pages 1487-1508.
    6. Daniele Pennesi, 2013. "Asset Prices in an Ambiguous Economy," Carlo Alberto Notebooks 315, Collegio Carlo Alberto.
    7. Karni, Edi & Maccheroni, Fabio & Marinacci, Massimo, 2015. "Ambiguity and Nonexpected Utility," Handbook of Game Theory with Economic Applications, Elsevier.
    8. Hui Chen & Nengjiu Ju & Jianjun Miao, 2014. "Dynamic Asset Allocation with Ambiguous Return Predictability," Review of Economic Dynamics, Elsevier for the Society for Economic Dynamics, vol. 17(4), pages 799-823, October.
    9. Nengjiu Ju & Jianjun Miao, 2012. "Ambiguity, Learning, and Asset Returns," Econometrica, Econometric Society, vol. 80(2), pages 559-591, March.
    10. Miao, Jianjun & Wang, Neng, 2011. "Risk, uncertainty, and option exercise," Journal of Economic Dynamics and Control, Elsevier, vol. 35(4), pages 442-461, April.
    11. Hengjie Ai & Ravi Bansal, 2016. "Risk Preferences and The Macro Announcement Premium," NBER Working Papers 22527, National Bureau of Economic Research, Inc.
    12. Boloori, Alireza & Saghafian, Soroush & Chakkera, Harini A. A. & Cook, Curtiss B., 2017. "Data-Driven Management of Post-transplant Medications: An APOMDP Approach," Working Paper Series rwp17-036, Harvard University, John F. Kennedy School of Government.
    13. Peter von zur Muehlen, 2022. "Prices and Taxes in a Ramsey Climate Policy Model under Heterogeneous Beliefs and Ambiguity," Economies, MDPI, vol. 10(10), pages 1-56, October.
    14. Kwon, Hyosung & Miao, Jianjun, 2017. "Three types of robust Ramsey problems in a linear-quadratic framework," Journal of Economic Dynamics and Control, Elsevier, vol. 76(C), pages 211-231.
    15. Li, Jian, 2020. "Preferences for partial information and ambiguity," Theoretical Economics, Econometric Society, vol. 15(3), July.
    16. Alireza Boloori & Soroush Saghafian & Harini A. Chakkera & Curtiss B. Cook, 2020. "Data-Driven Management of Post-transplant Medications: An Ambiguous Partially Observable Markov Decision Process Approach," Manufacturing & Service Operations Management, INFORMS, vol. 22(5), pages 1066-1087, September.
    17. Li, Jian & Zhou, Junjie, 2016. "Blackwell's informativeness ranking with uncertainty-averse preferences," Games and Economic Behavior, Elsevier, vol. 96(C), pages 18-29.
    18. Hansen, Lars Peter & Sargent, Thomas J., 2015. "Four types of ignorance," Journal of Monetary Economics, Elsevier, vol. 69(C), pages 97-113.
    19. Hansen, Lars Peter & Sargent, Thomas J., 2022. "Structured ambiguity and model misspecification," Journal of Economic Theory, Elsevier, vol. 199(C).
    20. Karantounias, Anastasios G., 2023. "Doubts about the model and optimal policy," Journal of Economic Theory, Elsevier, vol. 210(C).

    More about this item

    Keywords

    POMDP; Unknown probabilities; Model ambiguity; Structural results; Control-limit policies;

    JEL classification:

    • C61 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling - - - Optimization Techniques; Programming Models; Dynamic Analysis
    • D81 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Criteria for Decision-Making under Risk and Uncertainty
    • D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness
    • D84 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Expectations; Speculations
