Constrained Markov decision processes in Borel spaces: from discounted to average optimality


Author

Listed:

Armando F. Mendoza-Pérez
(UNACH)
Héctor Jasso-Fuentes
(CINVESTAV–IPN)
Omar A. De-la-Cruz Courtois
(UNACH)

Abstract

In this paper we study discrete-time Markov decision processes in Borel spaces with a finite number of constraints and with unbounded rewards and costs. Our aim is to provide a simple method to compute constrained optimal control policies when the payoff functions and the constraints are of either infinite-horizon discounted type or average (a.k.a. ergodic) type. To deduce optimality results for the discounted case, we use the Lagrange multipliers method, which rewrites the original problem (with constraints) as a parametric family of discounted unconstrained problems. Based on the dynamic programming technique along with a simple use of elementary differential calculus, we obtain both suitable Lagrange multipliers and a family of control policies associated with these multipliers; this family is optimal for the original problem with constraints. We next apply the vanishing discount factor method to obtain, in a straightforward way, optimal control policies for the average problem with constraints. Finally, to illustrate our results, we provide a simple application to linear–quadratic systems (LQ-systems).
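
The approach outlined in the abstract can be sketched in generic notation. The symbols below (one-stage reward r, one-stage costs c_i, constraint bounds θ_i, multipliers λ_i, discount factor α, optimal average value ρ*) are assumptions introduced here for illustration and are not taken from the paper; the limit in the last line holds only under suitable ergodicity and growth conditions.

```latex
% Hypothetical notation for illustration; not the paper's own symbols.
% Discounted value of a one-stage function g under policy \pi from state x:
\[
V_g(\pi,x) := \mathbb{E}_x^{\pi}\Big[\sum_{t=0}^{\infty}\alpha^{t}\, g(x_t,a_t)\Big],
\qquad 0<\alpha<1 .
\]
% Constrained discounted problem with q constraints:
\[
\max_{\pi}\; V_r(\pi,x)
\quad\text{subject to}\quad
V_{c_i}(\pi,x)\le\theta_i,\qquad i=1,\dots,q .
\]
% Lagrangian: for each fixed \lambda \ge 0 this is an unconstrained discounted
% problem with one-stage reward r_\lambda, plus a constant:
\[
L(\pi,\lambda,x)
= V_r(\pi,x)-\sum_{i=1}^{q}\lambda_i\big(V_{c_i}(\pi,x)-\theta_i\big)
= V_{r_\lambda}(\pi,x)+\sum_{i=1}^{q}\lambda_i\theta_i,
\qquad
r_\lambda := r-\sum_{i=1}^{q}\lambda_i c_i .
\]
% Vanishing discount heuristic linking discounted and average optimal values:
\[
\lim_{\alpha\uparrow 1}\,(1-\alpha)\,\sup_{\pi}V_g(\pi,x)=\rho^{*}.
\]
```

In this sketch, dynamic programming would solve the unconstrained problem for each fixed λ, and a multiplier would then be selected so that the resulting policy is feasible and satisfies complementary slackness, λ_i (V_{c_i}(π,x) − θ_i) = 0 for each i.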

Suggested Citation

Armando F. Mendoza-Pérez & Héctor Jasso-Fuentes & Omar A. De-la-Cruz Courtois, 2016. "Constrained Markov decision processes in Borel spaces: from discounted to average optimality," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 84(3), pages 489-525, December.

Handle: RePEc:spr:mathme:v:84:y:2016:i:3:d:10.1007_s00186-016-0551-3
DOI: 10.1007/s00186-016-0551-3



Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Kitti, Mitri, 2018. "Sustainable social choice under risk," Mathematical Social Sciences, Elsevier, vol. 94(C), pages 19-31.
Alessandro Bonatti, 2008. "Continuous-Time Screening Contracts," 2008 Meeting Papers 493, Society for Economic Dynamics.
Ronald Wendner, 2003. "Status, environmental externality, and optimal tax programs," Economics Bulletin, AccessEcon, vol. 8(5), pages 1-10.
Squintani, Francesco & Valimaki, Juuso, 2002. "Imitation and Experimentation in Changing Contests," Journal of Economic Theory, Elsevier, vol. 104(2), pages 376-404, June.
Chichilnisky, Graciela & Beltratti, Andrea & Heal, Geoffrey, 1994. "The environment and the long run: A comparison of different criteria," MPRA Paper 7907, University Library of Munich, Germany.
Keller, Godfrey & Rady, Sven, 2020. "Undiscounted bandit games," Games and Economic Behavior, Elsevier, vol. 124(C), pages 43-61.
- Keller, Godfrey & Rady, Sven, 2015. "Undiscounted Bandit Games," Discussion Paper Series of SFB/TR 15 Governance and the Efficiency of Economic Systems 520, Free University of Berlin, Humboldt University of Berlin, University of Bonn, University of Mannheim, University of Munich.
- Godfrey Keller & Sven Rady, 2019. "Undiscounted Bandit Games," Economics Series Working Papers 882, University of Oxford, Department of Economics.
- Godfrey Keller & Sven Rady, 2019. "Undiscounted Bandit Games," CRC TR 224 Discussion Paper Series crctr224_2019_130, University of Bonn and University of Mannheim, Germany.
- Rady, Sven & Keller, R Godfrey, 2019. "Undiscounted Bandit Games," CEPR Discussion Papers 14046, C.E.P.R. Discussion Papers.
- Godfrey Keller & Sven Rady, 2020. "Undiscounted Bandit Games," CRC TR 224 Discussion Paper Series crctr224_2020_130v2, University of Bonn and University of Mannheim, Germany.
- Godfrey Keller & Sven Rady, 2019. "Undiscounted Bandit Games," Papers 1909.13323, arXiv.org, revised Aug 2020.
Iho Antti & Kitti Mitri, 2011. "A Tail-Payoff Puzzle in Dynamic Pollution Control," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 11(1), pages 1-30, May.
Nowak, Andrzej S., 2008. "Equilibrium in a dynamic game of capital accumulation with the overtaking criterion," Economics Letters, Elsevier, vol. 99(2), pages 233-237, May.
Dutta, Prajit K. & Radner, Roy, 2009. "A strategic analysis of global warming: Theory and some numbers," Journal of Economic Behavior & Organization, Elsevier, vol. 71(2), pages 187-209, August.
- Roy Radner & Prajit K. Dutta, 2005. "A Strategic Analysis of Global Warming: Theory and Some Numbers," Working Papers 05-03, New York University, Leonard N. Stern School of Business, Department of Economics.
Graciela Chichilnisky, 1996. "An axiomatic approach to sustainable development," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 13(2), pages 231-257, April.
- Chichilnisky, Graciela, 1995. "An axiomatic approach to sustainable development," MPRA Paper 8609, University Library of Munich, Germany.
Fleurbaey, Marc & Michel, Philippe, 2003. "Intertemporal equity and the extension of the Ramsey criterion," Journal of Mathematical Economics, Elsevier, vol. 39(7), pages 777-802, September.
- FLEURBAEY, Marc & MICHEL, Philippe, 1997. "Intertemporal equity and the extension of the Ramsey criterion," LIDAM Discussion Papers CORE 1997004, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
- M. Fleurbaey & P. Michel, 1997. "Intertemporal equity and the extension of the Ramsey criterion," THEMA Working Papers 97-11, THEMA (THéorie Economique, Modélisation et Applications), Université de Cergy-Pontoise.
Kazuo Nishimura & John Stachurski, 2004. "Stochastic Optimal Growth when the Discount Rate Vanishes," Department of Economics - Working Papers Series 908, The University of Melbourne.
Lars J. Olson & Santanu Roy, 2006. "Theory of Stochastic Optimal Economic Growth," Springer Books, in: Rose-Anne Dana & Cuong Le Van & Tapan Mitra & Kazuo Nishimura (ed.), Handbook on Optimal Growth 1, chapter 11, pages 297-335, Springer.
- Olson, Lars J. & Roy, Santanu, 2005. "Theory of Stochastic Optimal Economic Growth," Working Papers 28601, University of Maryland, Department of Agricultural and Resource Economics.
Gerlagh, Reyer & Liski, Matti, 2008. "Strategic Resource Dependence," Economic Theory and Applications Working Papers 44222, Fondazione Eni Enrico Mattei (FEEM).
Nishimura, Kazuo & Stachurski, John, 2007. "Stochastic optimal policies when the discount rate vanishes," Journal of Economic Dynamics and Control, Elsevier, vol. 31(4), pages 1416-1430, April.
- Kazuo Nishimura & John Stachurski, 2006. "Stochastic Optimal Policies When the Discout Rate Vanishes," KIER Working Papers 617, Kyoto University, Institute of Economic Research.
Ghiglino, Christian & Tvede, Mich, 2000. "Optimal Policy in OG Models," Journal of Economic Theory, Elsevier, vol. 90(1), pages 62-83, January.
- Ghiglino, Christian & Tvede, Mich, 1998. "Optimal policy in OG models," Working Papers 08-1998, Copenhagen Business School, Department of Economics.
- Christian Ghiglino & Mich Tvede, 1999. "Optimal Policy in OG Models," Discussion Papers 99-23, University of Copenhagen. Department of Economics.
- Ghiglino, C. & Tvede, M., 1999. "Optimal Policy in OG Models," Papers 99-23, Carleton - School of Public Administration.
Gerlagh, Reyer & Liski, Matti, 2011. "Strategic resource dependence," Journal of Economic Theory, Elsevier, vol. 146(2), pages 699-727, March.
- Reyer Gerlagh & Matti Liski, 2008. "Strategic Resource Dependence," Working Papers 2008.72, Fondazione Eni Enrico Mattei.
Hakenes, Hendrik & Katolnik, Svetlana, 2017. "On the incentive effects of job rotation," European Economic Review, Elsevier, vol. 98(C), pages 424-441.
Urmee Khan & Maxwell Stinchcombe, 2014. "Patient Preferences, Intergenerational Equity, and the Precautionary Principle," Working Papers 201427, University of California at Riverside, Department of Economics.
Steinmetz, Alexander, 2010. "Competition, innovation, and the effect of knowledge accumulation," W.E.P. - Würzburg Economic Papers 81, University of Würzburg, Department of Economics.

More about this item

Keywords

Markov decision processes; Constrained control problems; Vanishing discount approach; Lagrange multipliers;

