Approximate solutions to constrained risk-sensitive Markov decision processes

Author

Listed:

Kumar, Uday M
Bhat, Sanjay P.
Kavitha, Veeraruna
Hemachandra, Nandyala

Abstract

This paper considers the problem of finding near-optimal Markovian randomized (MR) policies for finite-state-action, infinite-horizon, constrained risk-sensitive Markov decision processes (CRSMDPs). Constraints are in the form of standard expected discounted cost functions as well as expected risk-sensitive discounted cost functions over finite and infinite horizons. We first show that the aforementioned CRSMDP optimization problem possesses a solution if it is feasible (that is, if there exists a policy which satisfies all the constraints). Secondly, we provide two methods for finding an approximate solution in the form of an ultimately stationary (US) MR policy. The latter is achieved through two approximating finite-horizon CRSMDPs constructed from the original CRSMDP by time-truncating the original objective and constraint cost functions, and suitably perturbing the constraint upper bounds. The first approximation gives a US policy which is ϵ-optimal and feasible for the original problem, while the second approximation gives a near-optimal US policy whose violation of the original constraints is bounded above by a specified tolerance value ϵ. A key step in the proofs is an appropriate choice of a metric that makes the set of infinite-horizon MR policies and the feasible regions of the three CRSMDPs compact, and the objective and constraint functions continuous. We also discuss two applications and use an infinite-horizon risk-sensitive inventory control problem as an example to illustrate how existing solution techniques may be used to solve the two approximate finite-horizon problems mentioned above.
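
As a rough illustration of the time-truncation idea described in the abstract, the sketch below computes a horizon T beyond which the remaining standard expected discounted cost is at most ϵ. It assumes a discount factor `beta` in (0, 1) and per-stage costs bounded in absolute value by `c_max`, so that the cost accrued from stage T onward is at most `beta**T * c_max / (1 - beta)`. The names `tail_bound`, `truncation_horizon`, `beta`, `c_max` and `eps` are hypothetical and not taken from the paper. The paper's own truncation and its perturbation of the constraint upper bounds, in particular for the risk-sensitive costs, are not specified in this abstract and are not modeled here.

```python
import math


def tail_bound(beta: float, c_max: float, horizon: int) -> float:
    """Upper bound on the discounted cost accrued from stage `horizon` onward.

    Assumes per-stage costs satisfy |c| <= c_max and 0 < beta < 1, so the
    geometric tail sum is beta**horizon * c_max / (1 - beta).
    """
    return (beta ** horizon) * c_max / (1.0 - beta)


def truncation_horizon(beta: float, c_max: float, eps: float) -> int:
    """Smallest horizon T whose discounted tail bound is at most eps."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    if c_max < 0.0 or eps <= 0.0:
        raise ValueError("c_max must be non-negative and eps positive")
    if c_max == 0.0:
        return 0  # zero costs: nothing is lost by truncating immediately

    # Closed-form starting guess from beta**T * c_max / (1 - beta) <= eps.
    guess = math.log(eps * (1.0 - beta) / c_max) / math.log(beta)
    t = max(0, math.ceil(guess))

    # Correct for floating-point rounding in either direction.
    while tail_bound(beta, c_max, t) > eps:
        t += 1
    while t > 0 and tail_bound(beta, c_max, t - 1) <= eps:
        t -= 1
    return t
```

For example, with `beta = 0.5`, `c_max = 1.0` and `eps = 0.25`, the tail bound `2 * 0.5**T` first drops to 0.25 at T = 3, so `truncation_horizon(0.5, 1.0, 0.25)` returns 3.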

Suggested Citation

Kumar, Uday M & Bhat, Sanjay P. & Kavitha, Veeraruna & Hemachandra, Nandyala, 2023. "Approximate solutions to constrained risk-sensitive Markov decision processes," European Journal of Operational Research, Elsevier, vol. 310(1), pages 249-267.

Handle: RePEc:eee:ejores:v:310:y:2023:i:1:p:249-267
DOI: 10.1016/j.ejor.2023.02.039

Citations

Cited by:

Liu, Hui-hui & Yang, Guo-liang & Gao, Jian-wei & Wang, Ya-ping & Ni, Guo-hua, 2025. "Investigating the research and development performance of Chinese industry: A two-stage prospect data envelopment analysis approach," European Journal of Operational Research, Elsevier, vol. 323(3), pages 1040-1059.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Nicole Bäuerle & Anna Jaśkiewicz, 2024. "Markov decision processes with risk-sensitive criteria: an overview," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 99(1), pages 141-178, April.
Krishnamurthy Iyer & Nandyala Hemachandra, 2010. "Sensitivity analysis and optimal ultimately stationary deterministic policies in some constrained discounted cost models," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 71(3), pages 401-425, June.
Nicole Bäuerle & Ulrich Rieder, 2014. "More Risk-Sensitive Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 39(1), pages 105-120, February.
Monahan, George E. & Sobel, Matthew J., 1997. "Risk-Sensitive Dynamic Market Share Attraction Games," Games and Economic Behavior, Elsevier, vol. 20(2), pages 149-160, August.
Özlem Çavuş & Andrzej Ruszczyński, 2014. "Computational Methods for Risk-Averse Undiscounted Transient Markov Models," Operations Research, INFORMS, vol. 62(2), pages 401-417, April.
Pelin Canbolat, 2014. "Optimal halting policies in Markov population decision chains with constant risk posture," Annals of Operations Research, Springer, vol. 222(1), pages 227-237, November.
Karel Sladký, 2013. "Risk-Sensitive and Mean Variance Optimality in Markov Decision Processes," Czech Economic Review, Charles University Prague, Faculty of Social Sciences, Institute of Economic Studies, vol. 7(3), pages 146-161, November.
Sen Lin & Bo Li & Antonio Arreola-Risa & Yiwei Huang, 2023. "Optimizing a single-product production-inventory system under constant absolute risk aversion," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(3), pages 510-537, October.
Chan, Chi Kin & Lee, Y.C.E. & Campbell, J.F., 2013. "Environmental performance—Impacts of vendor–buyer coordination," International Journal of Production Economics, Elsevier, vol. 145(2), pages 683-695.
Lucy Gongtao Chen & Daniel Zhuoyu Long & Melvyn Sim, 2015. "On Dynamic Decision Making to Meet Consumption Targets," Operations Research, INFORMS, vol. 63(5), pages 1117-1130, October.
Li Chen & Melvyn Sim, 2025. "Robust CARA Optimization," Operations Research, INFORMS, vol. 73(3), pages 1459-1478, May.
Narayanan, Pranadharthiharan & Somasundaram, Jeeva & Seifert, Matthias, 2025. "Risk-averse algorithmic support and inventory management," European Journal of Operational Research, Elsevier, vol. 322(3), pages 993-1004.
Zeynep Erkin & Matthew D. Bailey & Lisa M. Maillart & Andrew J. Schaefer & Mark S. Roberts, 2010. "Eliciting Patients' Revealed Preferences: An Inverse Markov Decision Process Approach," Decision Analysis, INFORMS, vol. 7(4), pages 358-365, December.
Li, Xiang & Qi, Xiangtong & Li, Yongjian, 2021. "On sales effort and pricing decisions under alternative risk criteria," European Journal of Operational Research, Elsevier, vol. 293(2), pages 603-614.
Eugene A. Feinberg & Uriel G. Rothblum, 2012. "Splitting Randomized Stationary Policies in Total-Reward Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 37(1), pages 129-153, February.
C. Barz & K. Waldmann, 2007. "Risk-sensitive capacity control in revenue management," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 65(3), pages 565-579, June.
Nandyala Hemachandra & Kamma Sri Naga Rajesh & Mohd. Abdul Qavi, 2016. "A model for equilibrium in some service-provider user-set interactions," Annals of Operations Research, Springer, vol. 243(1), pages 95-115, August.
HuiChen Chiang, 2007. "Financial intermediary's choice of borrowing," Applied Economics, Taylor & Francis Journals, vol. 40(2), pages 251-260.
Kang Boda & Jerzy Filar, 2006. "Time Consistent Dynamic Risk Measures," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 63(1), pages 169-186, February.
Wang, Qiangqiang, 2025. "Innovations in digital supply chain finance: mitigating payment term and liquidity risks for SMEs," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 203(C).
