IDEAS home Printed from https://ideas.repec.org/a/inm/oropre/v58y2010i1p203-213.html

Percentile Optimization for Markov Decision Processes with Parameter Uncertainty

Authors

  • Erick Delage (Department of Management Science, HEC Montréal, Montréal, Quebec H3T 2A7, Canada)
  • Shie Mannor (Department of Electrical and Computer Engineering, McGill University, Montreal, Quebec H3A 2A7, Canada)

Abstract

Markov decision processes are an effective tool in modeling decision making in uncertain dynamic environments. Because the parameters of these models typically are estimated from data or learned from experience, it is not surprising that the actual performance of a chosen strategy often differs significantly from the designer's initial expectations due to unavoidable modeling ambiguity. In this paper, we present a set of percentile criteria that are conceptually natural and representative of the trade-off between optimistic and pessimistic views of the question. We study the use of these criteria under different forms of uncertainty for both the rewards and the transitions. Some forms are shown to be efficiently solvable and others highly intractable. In each case, we outline solution concepts that take parametric uncertainty into account in the process of decision making.
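
As an illustration only, and not the authors' method, one common reading of a percentile criterion is: for a fixed policy and a risk level eps, find the largest y such that the policy's expected discounted return is at least y with probability at least 1 - eps over the uncertain parameters. The sketch below estimates that quantity by Monte Carlo sampling. It assumes known transitions, independent Gaussian uncertainty on the rewards, and a policy already folded into the transition matrix; the function names, the two-state model, and all numbers are hypothetical.

```python
import random


def policy_value(P, r, gamma, start, iters=200):
    """Discounted value of a fixed policy from `start`, by iterating v = r + gamma * P v."""
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [r[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]
    return v[start]


def percentile_value(P, reward_mean, reward_std, gamma, start, eps,
                     n_samples=1000, seed=0):
    """Monte Carlo estimate of the largest y with Prob(value >= y) >= 1 - eps.

    Assumption: rewards are independent Gaussians; transitions P are known.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        # Draw one plausible reward vector and evaluate the policy under it.
        r = [rng.gauss(m, s) for m, s in zip(reward_mean, reward_std)]
        values.append(policy_value(P, r, gamma, start))
    values.sort()
    # At least a (1 - eps) fraction of the sampled values lie at or above this entry.
    k = int(eps * n_samples)
    return values[k]


# Hypothetical two-state chain induced by some fixed policy.
P = [[0.9, 0.1], [0.2, 0.8]]
reward_mean = [1.0, 0.0]
reward_std = [0.5, 0.5]

cautious = percentile_value(P, reward_mean, reward_std, 0.9, 0, eps=0.1)
median = percentile_value(P, reward_mean, reward_std, 0.9, 0, eps=0.5)
print(cautious, median)
```

A small eps gives a pessimistic guarantee and eps = 0.5 gives the median, which is the trade-off between pessimistic and optimistic views that the abstract refers to. Optimizing over policies, rather than evaluating a single one as here, is the harder problem the paper studies.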

Suggested Citation

  • Erick Delage & Shie Mannor, 2010. "Percentile Optimization for Markov Decision Processes with Parameter Uncertainty," Operations Research, INFORMS, vol. 58(1), pages 203-213, February.
  • Handle: RePEc:inm:oropre:v:58:y:2010:i:1:p:203-213
    DOI: 10.1287/opre.1080.0685

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/opre.1080.0685
    Download Restriction: no


    References listed on IDEAS

    1. Shie Mannor & Duncan Simester & Peng Sun & John N. Tsitsiklis, 2007. "Bias and Variance Approximation in Value Function Estimates," Management Science, INFORMS, vol. 53(2), pages 308-322, February.
    2. Ronald A. Howard & James E. Matheson, 1972. "Risk-Sensitive Markov Decision Processes," Management Science, INFORMS, vol. 18(7), pages 356-369, March.
    3. Garud N. Iyengar, 2005. "Robust Dynamic Programming," Mathematics of Operations Research, INFORMS, vol. 30(2), pages 257-280, May.
    4. A. Ben-Tal & A. Nemirovski, 1998. "Robust Convex Optimization," Mathematics of Operations Research, INFORMS, vol. 23(4), pages 769-805, November.
    5. Jay K. Satia & Roy E. Lave, 1973. "Markovian Decision Processes with Uncertain Transition Probabilities," Operations Research, INFORMS, vol. 21(3), pages 728-740, June.
    6. Arnab Nilim & Laurent El Ghaoui, 2005. "Robust Control of Markov Decision Processes with Uncertain Transition Matrices," Operations Research, INFORMS, vol. 53(5), pages 780-798, October.
    7. G. C. Calafiore & L. El Ghaoui, 2006. "On Distributionally Robust Chance-Constrained Linear Programs," Journal of Optimization Theory and Applications, Springer, vol. 130(1), pages 1-22, July.
    8. J. K. Satia & R. E. Lave, 1973. "Markovian Decision Processes with Probabilistic Observation of States," Management Science, INFORMS, vol. 20(1), pages 1-13, September.

    Citations

    Cited by:

    1. Anthony Coache & Sebastian Jaimungal, 2021. "Reinforcement Learning with Dynamic Convex Risk Measures," Papers 2112.13414, arXiv.org, revised Nov 2022.
    2. Zhu, Zhicheng & Xiang, Yisha & Zhao, Ming & Shi, Yue, 2023. "Data-driven remanufacturing planning with parameter uncertainty," European Journal of Operational Research, Elsevier, vol. 309(1), pages 102-116.
    3. Zeynep Turgay & Fikri Karaesmen & Egemen Lerzan Örmeci, 2018. "Structural properties of a class of robust inventory and queueing control problems," Naval Research Logistics (NRL), John Wiley & Sons, vol. 65(8), pages 699-716, December.
    4. Chernonog, Tatyana & Avinadav, Tal, 2014. "Profit criteria involving risk in price setting of virtual products," European Journal of Operational Research, Elsevier, vol. 236(1), pages 351-360.
    5. Huan Xu & Constantine Caramanis & Shie Mannor, 2012. "Optimization Under Probabilistic Envelope Constraints," Operations Research, INFORMS, vol. 60(3), pages 682-699, June.
    6. Bren, Austin & Saghafian, Soroush, 2018. "Data-Driven Percentile Optimization for Multi-Class Queueing Systems with Model Ambiguity: Theory and Application," Working Paper Series rwp18-008, Harvard University, John F. Kennedy School of Government.
    7. Maximilian Blesch & Philipp Eisenhauer, 2023. "Robust Decision-Making under Risk and Ambiguity," Rationality and Competition Discussion Paper Series 463, CRC TRR 190 Rationality and Competition.
    8. Saghafian, Soroush, 2018. "Ambiguous partially observable Markov decision processes: Structural results and applications," Journal of Economic Theory, Elsevier, vol. 178(C), pages 1-35.
    9. Huan Xu & Constantine Caramanis & Shie Mannor, 2012. "A Distributional Interpretation of Robust Optimization," Mathematics of Operations Research, INFORMS, vol. 37(1), pages 95-110, February.
    10. Shie Mannor & Ofir Mebel & Huan Xu, 2016. "Robust MDPs with k-Rectangular Uncertainty," Mathematics of Operations Research, INFORMS, vol. 41(4), pages 1484-1509, November.
    11. Li Xia, 2020. "Risk‐Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance," Production and Operations Management, Production and Operations Management Society, vol. 29(12), pages 2808-2827, December.
    12. Boloori, Alireza & Saghafian, Soroush & Chakkera, Harini A. A. & Cook, Curtiss B., 2017. "Data-Driven Management of Post-transplant Medications: An APOMDP Approach," Working Paper Series rwp17-036, Harvard University, John F. Kennedy School of Government.
    13. David L. Kaufman & Andrew J. Schaefer, 2013. "Robust Modified Policy Iteration," INFORMS Journal on Computing, INFORMS, vol. 25(3), pages 396-410, August.
    14. Felipe Caro & Aparupa Das Gupta, 2022. "Robust control of the multi-armed bandit problem," Annals of Operations Research, Springer, vol. 317(2), pages 461-480, October.
    15. Alireza Boloori & Soroush Saghafian & Harini A. Chakkera & Curtiss B. Cook, 2020. "Data-Driven Management of Post-transplant Medications: An Ambiguous Partially Observable Markov Decision Process Approach," Manufacturing & Service Operations Management, INFORMS, vol. 22(5), pages 1066-1087, September.
    16. Saghafian, Soroush & Tomlin, Brian & Biller, Stephan, 2018. "The Internet of Things and Information Fusion: Who Talks to Who?," Working Paper Series rwp18-009, Harvard University, John F. Kennedy School of Government.
    17. Wolfram Wiesemann & Daniel Kuhn & Berç Rustem, 2013. "Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 38(1), pages 153-183, February.
    18. V Varagapriya & Vikas Vikram Singh & Abdel Lisser, 2023. "Joint chance-constrained Markov decision processes," Annals of Operations Research, Springer, vol. 322(2), pages 1013-1035, March.
    19. Maximilian Blesch & Philipp Eisenhauer, 2021. "Robust Decision-Making Under Risk and Ambiguity," ECONtribute Discussion Papers Series 104, University of Bonn and University of Cologne, Germany.
    20. Huan Xu & Shie Mannor, 2012. "Distributionally Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 37(2), pages 288-300, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wolfram Wiesemann & Daniel Kuhn & Berç Rustem, 2013. "Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 38(1), pages 153-183, February.
    2. V Varagapriya & Vikas Vikram Singh & Abdel Lisser, 2023. "Joint chance-constrained Markov decision processes," Annals of Operations Research, Springer, vol. 322(2), pages 1013-1035, March.
    3. Zeynep Turgay & Fikri Karaesmen & Egemen Lerzan Örmeci, 2018. "Structural properties of a class of robust inventory and queueing control problems," Naval Research Logistics (NRL), John Wiley & Sons, vol. 65(8), pages 699-716, December.
    4. Shie Mannor & Ofir Mebel & Huan Xu, 2016. "Robust MDPs with k-Rectangular Uncertainty," Mathematics of Operations Research, INFORMS, vol. 41(4), pages 1484-1509, November.
    5. David L. Kaufman & Andrew J. Schaefer, 2013. "Robust Modified Policy Iteration," INFORMS Journal on Computing, INFORMS, vol. 25(3), pages 396-410, August.
    6. Andrew J. Keith & Darryl K. Ahner, 2021. "A survey of decision making and optimization under uncertainty," Annals of Operations Research, Springer, vol. 300(2), pages 319-353, May.
    7. Huan Xu & Shie Mannor, 2012. "Distributionally Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 37(2), pages 288-300, May.
    8. Peter Buchholz & Dimitri Scheftelowitsch, 2019. "Computation of weighted sums of rewards for concurrent MDPs," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 89(1), pages 1-42, February.
    9. Zeynep Turgay & Fikri Karaesmen & E. Örmeci, 2015. "A dynamic inventory rationing problem with uncertain demand and production rates," Annals of Operations Research, Springer, vol. 231(1), pages 207-228, August.
    10. Erim Kardeş & Fernando Ordóñez & Randolph W. Hall, 2011. "Discounted Robust Stochastic Games and an Application to Queueing Control," Operations Research, INFORMS, vol. 59(2), pages 365-382, April.
    11. Shiau Hong Lim & Huan Xu & Shie Mannor, 2016. "Reinforcement Learning in Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 41(4), pages 1325-1353, November.
    12. Wolfram Wiesemann & Daniel Kuhn & Berç Rustem, 2010. "Robust Markov Decision Processes," Working Papers 034, COMISEF.
    13. Arthur Flajolet & Sébastien Blandin & Patrick Jaillet, 2018. "Robust Adaptive Routing Under Uncertainty," Operations Research, INFORMS, vol. 66(1), pages 210-229, January.
    14. Bren, Austin & Saghafian, Soroush, 2018. "Data-Driven Percentile Optimization for Multi-Class Queueing Systems with Model Ambiguity: Theory and Application," Working Paper Series rwp18-008, Harvard University, John F. Kennedy School of Government.
    15. Bakker, Hannah & Dunke, Fabian & Nickel, Stefan, 2020. "A structuring review on multi-stage optimization under uncertainty: Aligning concepts from theory and practice," Omega, Elsevier, vol. 96(C).
    16. Maximilian Blesch & Philipp Eisenhauer, 2023. "Robust Decision-Making under Risk and Ambiguity," Rationality and Competition Discussion Paper Series 463, CRC TRR 190 Rationality and Competition.
    17. Zhu, Zhicheng & Xiang, Yisha & Zhao, Ming & Shi, Yue, 2023. "Data-driven remanufacturing planning with parameter uncertainty," European Journal of Operational Research, Elsevier, vol. 309(1), pages 102-116.
    18. Michael Jong Kim & Andrew E.B. Lim, 2016. "Robust Multiarmed Bandit Problems," Management Science, INFORMS, vol. 62(1), pages 264-285, January.
    19. Schapaugh, Adam W. & Tyre, Andrew J., 2013. "Accounting for parametric uncertainty in Markov decision processes," Ecological Modelling, Elsevier, vol. 254(C), pages 15-21.
    20. Felipe Caro & Aparupa Das Gupta, 2022. "Robust control of the multi-armed bandit problem," Annals of Operations Research, Springer, vol. 317(2), pages 461-480, October.
