An Approximate Dynamic Programming Algorithm for Monotone Value Functions

My bibliography Save this article

An Approximate Dynamic Programming Algorithm for Monotone Value Functions

Author

Listed:

Daniel R. Jiang
(Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08540)
Warren B. Powell
(Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08540)

Registered:

Abstract

Many sequential decision problems can be formulated as Markov decision processes (MDPs) where the optimal value function (or cost-to-go function) can be shown to satisfy a monotone structure in some or all of its dimensions. When the state space becomes large, traditional techniques, such as the backward dynamic programming algorithm (i.e., backward induction or value iteration), may no longer be effective in finding a solution within a reasonable time frame, and thus we are forced to consider other approaches, such as approximate dynamic programming (ADP). We propose a provably convergent ADP algorithm called Monotone-ADP that exploits the monotonicity of the value functions to increase the rate of convergence. In this paper, we describe a general finite-horizon problem setting where the optimal value function is monotone, present a convergence proof for Monotone-ADP under various technical assumptions, and show numerical results for three application domains: optimal stopping , energy storage / allocation , and glycemic control for diabetes patients . The empirical results indicate that by taking advantage of monotonicity, we can attain high quality solutions within a relatively small number of iterations, using up to two orders of magnitude less computation than is needed to compute the optimal solution exactly.

Suggested Citation

Daniel R. Jiang & Warren B. Powell, 2015. "An Approximate Dynamic Programming Algorithm for Monotone Value Functions," Operations Research, INFORMS, vol. 63(6), pages 1489-1511, December.

Handle: RePEc:inm:oropre:v:63:y:2015:i:6:p:1489-1511
DOI: 10.1287/opre.2015.1425

Download full text from publisher

References listed on IDEAS

Rust, John, 1987. "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher," Econometrica, Econometric Society, vol. 55(5), pages 999-1033, September.
J. O. Ramsay, 1998. "Estimating smooth monotone functions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(2), pages 365-375.
Feldstein, Martin S & Rothschild, Michael, 1974. "Towards an Economic Theory of Replacement Investment," Econometrica, Econometric Society, vol. 42(3), pages 393-423, May.
Mason, J.E. & Denton, B.T. & Shah, N.D. & Smith, S.A., 2014. "Optimizing the simultaneous management of blood pressure and cholesterol for type 2 diabetes patients," European Journal of Operational Research, Elsevier, vol. 233(3), pages 727-738.
Alfred Müller, 1997. "How Does the Value Function of a Markov Decision Process Depend on the Transition Probabilities?," Mathematics of Operations Research, INFORMS, vol. 22(4), pages 872-885, November.
James E. Smith & Kevin F. McCardle, 2002. "Structural Properties of Stochastic Dynamic Programs," Operations Research, INFORMS, vol. 50(5), pages 796-809, October.
Jae Ho Kim & Warren B. Powell, 2011. "Optimal Energy Commitments with Storage and Intermittent Supply," Operations Research, INFORMS, vol. 59(6), pages 1347-1360, December.
J. J. McCall, 1970. "Economics of Information and Job Search," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 84(1), pages 113-126.
Greg Kaplan & Giovanni L. Violante, 2014. "A Model of the Consumption Response to Fiscal Stimulus Payments," Econometrica, Econometric Society, vol. 82(4), pages 1199-1239, July.
- Greg Kaplan & Giovanni L. Violante, 2011. "A Model of the Consumption Response to Fiscal Stimulus Payments," NBER Working Papers 17338, National Bureau of Economic Research, Inc.
- Gianluca Violante & Greg Kaplan, 2011. "A Model of the Consumption Response to Fiscal Stimulus Payments," 2011 Meeting Papers 243, Society for Economic Dynamics.
- Violante, Giovanni & Kaplan, Greg, 2011. "A Model of the Consumption Response to Fiscal Stimulus Payments," CEPR Discussion Papers 8562, C.E.P.R. Discussion Papers.
Juliana M. Nascimento & Warren B. Powell, 2009. "An Optimal Approximate Dynamic Programming Algorithm for the Lagged Asset Acquisition Problem," Mathematics of Operations Research, INFORMS, vol. 34(1), pages 210-237, February.
Daniel R. Jiang & Warren B. Powell, 2015. "Optimal Hour-Ahead Bidding in the Real-Time Electricity Market with Battery Storage Using Approximate Dynamic Programming," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 525-543, August.
Warren Powell & Andrzej Ruszczyński & Huseyin Topaloglu, 2004. "Learning Algorithms for Separable Approximations of Discrete Stochastic Optimization Problems," Mathematics of Operations Research, INFORMS, vol. 29(4), pages 814-836, November.
Jennifer E. Mason & Darin A. England & Brian T. Denton & Steven A. Smith & Murat Kurt & Nilay D. Shah, 2012. "Optimizing Statin Treatment Decisions for Diabetes Patients in the Presence of Uncertain Future Adherence," Medical Decision Making, , vol. 32(1), pages 154-166, January.
Nicola Secomandi, 2010. "Optimal Commodity Trading with a Capacitated Storage Asset," Management Science, INFORMS, vol. 56(3), pages 449-467, March.
Rene Carmona & Michael Ludkovski, 2010. "Valuation of energy storage: an optimal switching approach," Quantitative Finance, Taylor & Francis Journals, vol. 10(4), pages 359-374.
Juliana Nascimento & Warren Powell, 2010. "Dynamic Programming Models and Algorithms for the Mutual Fund Cash Balance Problem," Management Science, INFORMS, vol. 56(5), pages 801-815, May.
Papadaki, Katerina P. & Powell, Warren B., 2002. "Exploiting structure in adaptive dynamic programming algorithms for a stochastic batch service problem," European Journal of Operational Research, Elsevier, vol. 142(1), pages 108-127, October.
John R. Birge, 1985. "Decomposition and Partitioning Methods for Multistage Stochastic Linear Programs," Operations Research, INFORMS, vol. 33(5), pages 989-1007, October.

Full references (including those not matched with items on IDEAS)

Citations

Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.

Cited by:

Achref Bachouch & Côme Huré & Nicolas Langrené & Huyen Pham, 2020. "Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications," Post-Print hal-01949221, HAL.
Alexandre, Rodrigo e Alvim & Fragoso, Marcelo D. & Filho, Virgílio J.M. Ferreira & Arruda, Edilson F., 2025. "Solving Markov decision processes via state space decomposition and time aggregation," European Journal of Operational Research, Elsevier, vol. 324(1), pages 155-167.
Daniel R. Jiang & Warren B. Powell, 2018. "Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures," Mathematics of Operations Research, INFORMS, vol. 43(2), pages 554-579, May.
Mousavi, Kianoush & Bodur, Merve & Cevik, Mucahit & Roorda, Matthew J., 2024. "Approximate dynamic programming for pickup and delivery problem with crowd-shipping," Transportation Research Part B: Methodological, Elsevier, vol. 187(C).
Fokkema, Jan Eise & uit het Broek, Michiel A.J. & Schrotenboer, Albert H. & Land, Martin J. & Van Foreest, Nicky D., 2022. "Seasonal hydrogen storage decisions under constrained electricity distribution capacity," Renewable Energy, Elsevier, vol. 195(C), pages 76-91.
Chen, Yao & Liu, Yang & Bai, Yun & Mao, Baohua, 2024. "Real-time dispatch management of shared autonomous vehicles with on-demand and pre-booked requests," Transportation Research Part A: Policy and Practice, Elsevier, vol. 181(C).
Daniel R. Jiang & Warren B. Powell, 2015. "Optimal Hour-Ahead Bidding in the Real-Time Electricity Market with Battery Storage Using Approximate Dynamic Programming," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 525-543, August.
Antoine Jacquier & Hao Liu, 2017. "Optimal liquidation in a Level-I limit order book for large tick stocks," Papers 1701.01327, arXiv.org, revised Nov 2017.
Andrew J. Collins & Patrick Hester & Barry Ezell & John Horst, 2016. "An improvement selection methodology for key performance indicators," Environment Systems and Decisions, Springer, vol. 36(2), pages 196-208, June.
Achref Bachouch & Côme Huré & Nicolas Langrené & Huyen Pham, 2019. "Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications," Working Papers hal-01949221, HAL.
Weitzel, Timm & Glock, Christoph H., 2018. "Energy management for stationary electric energy storage systems: A systematic literature review," European Journal of Operational Research, Elsevier, vol. 264(2), pages 582-606.
Achref Bachouch & Côme Huré & Nicolas Langrené & Huyên Pham, 2022. "Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon: Numerical Applications," Methodology and Computing in Applied Probability, Springer, vol. 24(1), pages 143-178, March.
Al-Kanj, Lina & Nascimento, Juliana & Powell, Warren B., 2020. "Approximate dynamic programming for planning a ride-hailing system using autonomous fleets of electric vehicles," European Journal of Operational Research, Elsevier, vol. 284(3), pages 1088-1106.
Yin, Jiateng & Tang, Tao & Yang, Lixing & Gao, Ziyou & Ran, Bin, 2016. "Energy-efficient metro train rescheduling with uncertain time-variant passenger demands: An approximate dynamic programming approach," Transportation Research Part B: Methodological, Elsevier, vol. 91(C), pages 178-210.
Andersen, Jesper Fink & Andersen, Anders Reenberg & Kulahci, Murat & Nielsen, Bo Friis, 2022. "A numerical study of Markov decision process algorithms for multi-component replacement problems," European Journal of Operational Research, Elsevier, vol. 299(3), pages 898-909.
Achref Bachouch & C^ome Hur'e & Nicolas Langren'e & Huyen Pham, 2018. "Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications," Papers 1812.05916, arXiv.org, revised Jan 2020.
Sebastian Becker & Patrick Cheridito & Arnulf Jentzen & Timo Welti, 2019. "Solving high-dimensional optimal stopping problems using deep learning," Papers 1908.01602, arXiv.org, revised Aug 2021.
Ulmer, Marlin W. & Thomas, Barrett W., 2020. "Meso-parametric value function approximation for dynamic customer acceptances in delivery routing," European Journal of Operational Research, Elsevier, vol. 285(1), pages 183-195.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Daniel R. Jiang & Warren B. Powell, 2015. "Optimal Hour-Ahead Bidding in the Real-Time Electricity Market with Battery Storage Using Approximate Dynamic Programming," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 525-543, August.
Anna Maria Gambaro & Nicola Secomandi, 2021. "A Discussion of Non‐Gaussian Price Processes for Energy and Commodity Operations," Production and Operations Management, Production and Operations Management Society, vol. 30(1), pages 47-67, January.
Secomandi, Nicola & Seppi, Duane J., 2014. "Real Options and Merchant Operations of Energy and Other Commodities," Foundations and Trends(R) in Technology, Information and Operations Management, now publishers, vol. 6(3-4), pages 161-331, July.
Saif Benjaafar & Daniel Jiang & Xiang Li & Xiaobo Li, 2022. "Dynamic Inventory Repositioning in On-Demand Rental Networks," Management Science, INFORMS, vol. 68(11), pages 7861-7878, November.
Yangfang (Helen) Zhou & Alan Scheller‐Wolf & Nicola Secomandi & Stephen Smith, 2019. "Managing Wind‐Based Electricity Generation in the Presence of Storage and Transmission Capacity," Production and Operations Management, Production and Operations Management Society, vol. 28(4), pages 970-989, April.
Ilya O. Ryzhov & Martijn R. K. Mes & Warren B. Powell & Gerald van den Berg, 2019. "Bayesian Exploration for Approximate Dynamic Programming," Operations Research, INFORMS, vol. 67(1), pages 198-214, January.
Jochen Gönsch & Michael Hassler, 2016. "Sell or store? An ADP approach to marketing renewable energy," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 38(3), pages 633-660, July.
Ekaterina Abramova & Derek Bunn, 2021. "Optimal Daily Trading of Battery Operations Using Arbitrage Spreads," Energies, MDPI, vol. 14(16), pages 1-23, August.
Benedikt Finnah, 2022. "Optimal bidding functions for renewable energies in sequential electricity markets," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 44(1), pages 1-27, March.
Sandeep Rath & Kumar Rajaram, 2022. "Staff Planning for Hospitals with Implicit Cost Estimation and Stochastic Optimization," Production and Operations Management, Production and Operations Management Society, vol. 31(3), pages 1271-1289, March.
Patrick J. Kehoe & Virgiliu Midrigan & Elena Pastorino, 2019. "Debt Constraints and Employment," Journal of Political Economy, University of Chicago Press, vol. 127(4), pages 1926-1991.
- Virgiliu Midrigan & Elena Pastorino & Patrick Kehoe, 2014. "Debt Constraints and Unemployment," 2014 Meeting Papers 1118, Society for Economic Dynamics.
- Patrick J. Kehoe & Virgiliu Midrigan & Elena Pastorino, 2016. "Debt Constraints and Employment," Staff Report 536, Federal Reserve Bank of Minneapolis.
- Patrick Kehoe & Elena Pastorino & Virgiliu Midrigan, 2016. "Debt Constraints and Employment," NBER Working Papers 22614, National Bureau of Economic Research, Inc.
Somayeh Moazeni & Warren B. Powell & Boris Defourny & Belgacem Bouzaiene-Ayari, 2017. "Parallel Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization," INFORMS Journal on Computing, INFORMS, vol. 29(2), pages 332-349, May.
Mark D. Manuszak & Charles F. Manski & Sanghamitra Das, 2005. "Walk or wait? An empirical analysis of street crossing decisions," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 20(4), pages 529-548.
- Sanghamitra Das & Charles F. Manski & Mark D. Manuszak, 2005. "Walk or wait? An empirical analysis of street crossing decisions," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 20(4), pages 529-548, May.
- Sanghamitra Das & Charles F. Manski & Mark D. Manuszak, 2003. "Walk or wait?: An empirical analysis of street crossing decisions," Discussion Papers 03-09, Indian Statistical Institute, Delhi.
- Mark Manuszak & Sanghamitra Das & Charles Manski, 2004. "Walk or Wait? An Empirical Analysis of Street Crossing Decisions," GSIA Working Papers 2003-41, Carnegie Mellon University, Tepper School of Business.
Nadarajah, Selvaprabu & Secomandi, Nicola, 2023. "A review of the operations literature on real options in energy," European Journal of Operational Research, Elsevier, vol. 309(2), pages 469-487.
Weitzel, Timm & Glock, Christoph H., 2018. "Energy management for stationary electric energy storage systems: A systematic literature review," European Journal of Operational Research, Elsevier, vol. 264(2), pages 582-606.
Navid Mojir & K. Sudhir, 2014. "Price Search Across Time and Across Stores," Cowles Foundation Discussion Papers 1942R, Cowles Foundation for Research in Economics, Yale University, revised Jul 2019.
Bastian Felix, 2012. "Gas Storage Valuation: A Comparative Simulation Study," EWL Working Papers 1201, University of Duisburg-Essen, Chair for Management Science and Energy Economics, revised Apr 2014.
Qingyin Ma & John Stachurski, 2019. "Dynamic Optimal Choice When Rewards are Unbounded Below," Papers 1911.13025, arXiv.org.
Manuel Arellano & Stéphane Bonhomme, 2017. "Nonlinear Panel Data Methods for Dynamic Heterogeneous Agent Models," Annual Review of Economics, Annual Reviews, vol. 9(1), pages 471-496, September.
- Manuel Arellano & Stéphane Bonhomme, 2016. "Nonlinear Panel Data Methods for Dynamic Heterogeneous Agent Models," Working Papers wp2016_1607, CEMFI.
- Manuel Arellano & Stéphane Bonhomme, 2017. "Nonlinear Panel Data Methods for Dynamic Heterogeneous Agent Models," Working Papers wp2017_1703, CEMFI.
- Manuel Arellano & Stéphane Bonhomme, 2016. "Nonlinear panel data methods for dynamic heterogeneous agent models," CeMMAP working papers CWP51/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Manuel Arellano & Stéphane Bonhomme, 2016. "Nonlinear panel data methods for dynamic heterogeneous agent models," CeMMAP working papers 51/16, Institute for Fiscal Studies.
Keane, Michael P. & Todd, Petra E. & Wolpin, Kenneth I., 2011. "The Structural Estimation of Behavioral Models: Discrete Choice Dynamic Programming Methods and Applications," Handbook of Labor Economics, in: O. Ashenfelter & D. Card (ed.), Handbook of Labor Economics, edition 1, volume 4, chapter 4, pages 331-461, Elsevier.

More about this item

Keywords

approximate dynamic programming; monotonicity; optimal stopping; energy storage; glycemic control;
All these keywords.

Statistics

Access and download statistics

Corrections

All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:oropre:v:63:y:2015:i:6:p:1489-1511. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Chris Asher (email available below). General contact details of provider: https://edirc.repec.org/data/inforea.html .

Please note that corrections may take a couple of weeks to filter through the various RePEc services.

IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.

Browse Econ Literature

More features

An Approximate Dynamic Programming Algorithm for Monotone Value Functions

Author

Abstract

Suggested Citation

Download full text from publisher

References listed on IDEAS

Citations

Most related items

More about this item

Keywords

Statistics

Corrections

More services and features

MyIDEAS

Author registration

Rankings

RePEc Genealogy

RePEc Biblio

MPRA

New papers by email

EconAcademics

Plagiarism

About RePEc

RePEc home

Blog

Help/FAQ

RePEc team

Participating archives

Privacy statement

Help us

Corrections

Volunteers

Get papers listed

Open a RePEc archive

Get RePEc data