
First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function

Authors

  • Patrick Kern (Saarland University)
  • Axel Simroth (Fraunhofer Institute for Transportation and Infrastructure Systems)
  • Henryk Zähle (Saarland University)

Abstract

Markov decision models (MDMs) used in practical applications are most often less complex than the underlying ‘true’ MDM. The reduction of model complexity is performed for several reasons. However, it is of interest to know what kind of model reduction is reasonable (with regard to the optimal value) and what kind is not. In this article we propose a way to address this question. We introduce a sort of derivative of the optimal value as a function of the transition probabilities, which can be used to measure the (first-order) sensitivity of the optimal value with respect to changes in the transition probabilities. ‘Differentiability’ is obtained for a fairly broad class of MDMs, and the ‘derivative’ is specified explicitly. Our theoretical findings are illustrated by means of optimization problems in inventory control and mathematical finance.
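
To make the idea of first-order sensitivity concrete, the following is a minimal numerical sketch and not the construction used in the article. It assumes a toy finite-horizon MDM with finitely many states and actions, a zero terminal reward, and invented numbers. It approximates the change in the optimal value when the transition probabilities P are shifted slightly towards an alternative Q, using a one-sided finite difference along (1 - h)P + hQ. The names `optimal_value`, `mix` and `directional_sensitivity` are hypothetical helpers introduced only for this illustration.

```python
def optimal_value(P, r, horizon):
    """Optimal expected total reward by backward induction.

    P[a][s][t]: probability of moving from state s to state t under action a.
    r[a][s]:    one-stage reward for taking action a in state s.
    Assumes a zero terminal reward (an assumption of this sketch).
    """
    n_actions, n_states = len(P), len(P[0])
    v = [0.0] * n_states
    for _ in range(horizon):
        # Bellman step: best action value in each state, given next-stage values v.
        v = [
            max(
                r[a][s] + sum(P[a][s][t] * v[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
    return v


def mix(P, Q, h):
    """Convex combination (1 - h) * P + h * Q, entry by entry."""
    return [
        [[(1.0 - h) * p + h * q for p, q in zip(row_p, row_q)]
         for row_p, row_q in zip(P_a, Q_a)]
        for P_a, Q_a in zip(P, Q)
    ]


def directional_sensitivity(P, Q, r, horizon, h=1e-6):
    """One-sided finite difference of the optimal value in the direction Q - P."""
    base = optimal_value(P, r, horizon)
    shifted = optimal_value(mix(P, Q, h), r, horizon)
    return [(v1 - v0) / h for v0, v1 in zip(base, shifted)]


# Toy data (invented for illustration): 2 actions, 2 states, horizon 2.
r = [[1.0, 0.0], [0.0, 2.0]]
P = [[[0.9, 0.1], [0.5, 0.5]],
     [[0.2, 0.8], [0.3, 0.7]]]
Q = [[[0.5, 0.5], [0.5, 0.5]],
     [[0.5, 0.5], [0.5, 0.5]]]

sens = directional_sensitivity(P, Q, r, horizon=2)
print(sens)  # one sensitivity value per initial state
```

A positive entry indicates that, to first order, replacing P by a model slightly closer to Q raises the optimal value from that initial state; a negative entry indicates that it lowers it. A finite difference of this kind only probes one direction at a time and says nothing about the class of models for which a derivative exists, which is the question the article addresses.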

Suggested Citation

  • Patrick Kern & Axel Simroth & Henryk Zähle, 2020. "First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 92(1), pages 165-197, August.
  • Handle: RePEc:spr:mathme:v:92:y:2020:i:1:d:10.1007_s00186-020-00706-w
    DOI: 10.1007/s00186-020-00706-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00186-020-00706-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations



    Cited by:

    1. Henryk Zähle, 2022. "A concept of copula robustness and its applications in quantitative risk management," Finance and Stochastics, Springer, vol. 26(4), pages 825-875, October.

