
An Approximate Dynamic Programming Algorithm for Monotone Value Functions

Author

Listed:
  • Daniel R. Jiang

    (Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08540)

  • Warren B. Powell

    (Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08540)

Abstract

Many sequential decision problems can be formulated as Markov decision processes (MDPs) where the optimal value function (or cost-to-go function) can be shown to satisfy a monotone structure in some or all of its dimensions. When the state space becomes large, traditional techniques, such as the backward dynamic programming algorithm (i.e., backward induction or value iteration), may no longer be effective in finding a solution within a reasonable time frame, and thus we are forced to consider other approaches, such as approximate dynamic programming (ADP). We propose a provably convergent ADP algorithm called Monotone-ADP that exploits the monotonicity of the value functions to increase the rate of convergence. In this paper, we describe a general finite-horizon problem setting where the optimal value function is monotone, present a convergence proof for Monotone-ADP under various technical assumptions, and show numerical results for three application domains: optimal stopping, energy storage/allocation, and glycemic control for diabetes patients. The empirical results indicate that by taking advantage of monotonicity, we can attain high quality solutions within a relatively small number of iterations, using up to two orders of magnitude less computation than is needed to compute the optimal solution exactly.
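
The abstract does not spell out how monotonicity is exploited, so the following is only a minimal illustrative sketch of one way a single observation could be propagated through a monotone value function approximation. It assumes a one-dimensional, integer-indexed state and a value function that is nondecreasing in that index. The function name `monotone_projection` and the toy data are hypothetical and are not taken from this record.

```python
def monotone_projection(values, idx, new_value):
    """Return a copy of `values` with values[idx] = new_value, repaired so
    the result is nondecreasing in the index (assumes `values` already is).

    Illustrative assumption: states are ordered by their index, and a
    larger index should never have a smaller value.
    """
    out = list(values)
    out[idx] = new_value
    for j in range(len(out)):
        if j > idx and out[j] < new_value:
            # A "larger" state cannot be worth less than the updated state.
            out[j] = new_value
        elif j < idx and out[j] > new_value:
            # A "smaller" state cannot be worth more than the updated state.
            out[j] = new_value
    return out


# Toy usage: start from a flat estimate and apply two observations.
estimate = [0.0, 0.0, 0.0, 0.0, 0.0]
estimate = monotone_projection(estimate, 2, 1.0)
estimate = monotone_projection(estimate, 1, 3.0)
print(estimate)
```

In this sketch, one observation at a single state also adjusts every state that would otherwise violate the ordering, which is one plausible reason a monotone structure could reduce the number of iterations needed.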

Suggested Citation

  • Daniel R. Jiang & Warren B. Powell, 2015. "An Approximate Dynamic Programming Algorithm for Monotone Value Functions," Operations Research, INFORMS, vol. 63(6), pages 1489-1511, December.
  • Handle: RePEc:inm:oropre:v:63:y:2015:i:6:p:1489-1511
    DOI: 10.1287/opre.2015.1425

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/opre.2015.1425
    Download Restriction: no


    References listed on IDEAS

    1. Rust, John, 1987. "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher," Econometrica, Econometric Society, vol. 55(5), pages 999-1033, September.
    2. J. O. Ramsay, 1998. "Estimating smooth monotone functions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(2), pages 365-375.
    3. Feldstein, Martin S & Rothschild, Michael, 1974. "Towards an Economic Theory of Replacement Investment," Econometrica, Econometric Society, vol. 42(3), pages 393-423, May.
    4. Mason, J.E. & Denton, B.T. & Shah, N.D. & Smith, S.A., 2014. "Optimizing the simultaneous management of blood pressure and cholesterol for type 2 diabetes patients," European Journal of Operational Research, Elsevier, vol. 233(3), pages 727-738.
    5. Alfred Müller, 1997. "How Does the Value Function of a Markov Decision Process Depend on the Transition Probabilities?," Mathematics of Operations Research, INFORMS, vol. 22(4), pages 872-885, November.
    6. James E. Smith & Kevin F. McCardle, 2002. "Structural Properties of Stochastic Dynamic Programs," Operations Research, INFORMS, vol. 50(5), pages 796-809, October.
    7. Jae Ho Kim & Warren B. Powell, 2011. "Optimal Energy Commitments with Storage and Intermittent Supply," Operations Research, INFORMS, vol. 59(6), pages 1347-1360, December.
    8. J. J. McCall, 1970. "Economics of Information and Job Search," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 84(1), pages 113-126.
    9. Greg Kaplan & Giovanni L. Violante, 2014. "A Model of the Consumption Response to Fiscal Stimulus Payments," Econometrica, Econometric Society, vol. 82(4), pages 1199-1239, July.
    10. Juliana M. Nascimento & Warren B. Powell, 2009. "An Optimal Approximate Dynamic Programming Algorithm for the Lagged Asset Acquisition Problem," Mathematics of Operations Research, INFORMS, vol. 34(1), pages 210-237, February.
    11. Daniel R. Jiang & Warren B. Powell, 2015. "Optimal Hour-Ahead Bidding in the Real-Time Electricity Market with Battery Storage Using Approximate Dynamic Programming," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 525-543, August.
    12. Warren Powell & Andrzej Ruszczyński & Huseyin Topaloglu, 2004. "Learning Algorithms for Separable Approximations of Discrete Stochastic Optimization Problems," Mathematics of Operations Research, INFORMS, vol. 29(4), pages 814-836, November.
    13. Jennifer E. Mason & Darin A. England & Brian T. Denton & Steven A. Smith & Murat Kurt & Nilay D. Shah, 2012. "Optimizing Statin Treatment Decisions for Diabetes Patients in the Presence of Uncertain Future Adherence," Medical Decision Making, vol. 32(1), pages 154-166, January.
    14. Nicola Secomandi, 2010. "Optimal Commodity Trading with a Capacitated Storage Asset," Management Science, INFORMS, vol. 56(3), pages 449-467, March.
    15. Rene Carmona & Michael Ludkovski, 2010. "Valuation of energy storage: an optimal switching approach," Quantitative Finance, Taylor & Francis Journals, vol. 10(4), pages 359-374.
    16. Juliana Nascimento & Warren Powell, 2010. "Dynamic Programming Models and Algorithms for the Mutual Fund Cash Balance Problem," Management Science, INFORMS, vol. 56(5), pages 801-815, May.
    17. Papadaki, Katerina P. & Powell, Warren B., 2002. "Exploiting structure in adaptive dynamic programming algorithms for a stochastic batch service problem," European Journal of Operational Research, Elsevier, vol. 142(1), pages 108-127, October.
    18. John R. Birge, 1985. "Decomposition and Partitioning Methods for Multistage Stochastic Linear Programs," Operations Research, INFORMS, vol. 33(5), pages 989-1007, October.

    Citations



    Cited by:

    1. Achref Bachouch & Côme Huré & Nicolas Langrené & Huyen Pham, 2020. "Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications," Post-Print hal-01949221, HAL.
    2. Daniel R. Jiang & Warren B. Powell, 2018. "Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures," Mathematics of Operations Research, INFORMS, vol. 43(2), pages 554-579, May.
    3. Fokkema, Jan Eise & uit het Broek, Michiel A.J. & Schrotenboer, Albert H. & Land, Martin J. & Van Foreest, Nicky D., 2022. "Seasonal hydrogen storage decisions under constrained electricity distribution capacity," Renewable Energy, Elsevier, vol. 195(C), pages 76-91.
    4. Daniel R. Jiang & Warren B. Powell, 2015. "Optimal Hour-Ahead Bidding in the Real-Time Electricity Market with Battery Storage Using Approximate Dynamic Programming," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 525-543, August.
    5. Antoine Jacquier & Hao Liu, 2017. "Optimal liquidation in a Level-I limit order book for large tick stocks," Papers 1701.01327, arXiv.org, revised Nov 2017.
    6. Andrew J. Collins & Patrick Hester & Barry Ezell & John Horst, 2016. "An improvement selection methodology for key performance indicators," Environment Systems and Decisions, Springer, vol. 36(2), pages 196-208, June.
    7. Achref Bachouch & Côme Huré & Nicolas Langrené & Huyen Pham, 2019. "Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications," Working Papers hal-01949221, HAL.
    8. Weitzel, Timm & Glock, Christoph H., 2018. "Energy management for stationary electric energy storage systems: A systematic literature review," European Journal of Operational Research, Elsevier, vol. 264(2), pages 582-606.
    9. Achref Bachouch & Côme Huré & Nicolas Langrené & Huyên Pham, 2022. "Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon: Numerical Applications," Methodology and Computing in Applied Probability, Springer, vol. 24(1), pages 143-178, March.
    10. Al-Kanj, Lina & Nascimento, Juliana & Powell, Warren B., 2020. "Approximate dynamic programming for planning a ride-hailing system using autonomous fleets of electric vehicles," European Journal of Operational Research, Elsevier, vol. 284(3), pages 1088-1106.
    11. Yin, Jiateng & Tang, Tao & Yang, Lixing & Gao, Ziyou & Ran, Bin, 2016. "Energy-efficient metro train rescheduling with uncertain time-variant passenger demands: An approximate dynamic programming approach," Transportation Research Part B: Methodological, Elsevier, vol. 91(C), pages 178-210.
    12. Andersen, Jesper Fink & Andersen, Anders Reenberg & Kulahci, Murat & Nielsen, Bo Friis, 2022. "A numerical study of Markov decision process algorithms for multi-component replacement problems," European Journal of Operational Research, Elsevier, vol. 299(3), pages 898-909.
    13. Achref Bachouch & Côme Huré & Nicolas Langrené & Huyen Pham, 2018. "Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications," Papers 1812.05916, arXiv.org, revised Jan 2020.
    14. Sebastian Becker & Patrick Cheridito & Arnulf Jentzen & Timo Welti, 2019. "Solving high-dimensional optimal stopping problems using deep learning," Papers 1908.01602, arXiv.org, revised Aug 2021.
    15. Ulmer, Marlin W. & Thomas, Barrett W., 2020. "Meso-parametric value function approximation for dynamic customer acceptances in delivery routing," European Journal of Operational Research, Elsevier, vol. 285(1), pages 183-195.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel R. Jiang & Warren B. Powell, 2015. "Optimal Hour-Ahead Bidding in the Real-Time Electricity Market with Battery Storage Using Approximate Dynamic Programming," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 525-543, August.
    2. Anna Maria Gambaro & Nicola Secomandi, 2021. "A Discussion of Non‐Gaussian Price Processes for Energy and Commodity Operations," Production and Operations Management, Production and Operations Management Society, vol. 30(1), pages 47-67, January.
    3. Secomandi, Nicola & Seppi, Duane J., 2014. "Real Options and Merchant Operations of Energy and Other Commodities," Foundations and Trends(R) in Technology, Information and Operations Management, now publishers, vol. 6(3-4), pages 161-331, July.
    4. Saif Benjaafar & Daniel Jiang & Xiang Li & Xiaobo Li, 2022. "Dynamic Inventory Repositioning in On-Demand Rental Networks," Management Science, INFORMS, vol. 68(11), pages 7861-7878, November.
    5. Yangfang (Helen) Zhou & Alan Scheller‐Wolf & Nicola Secomandi & Stephen Smith, 2019. "Managing Wind‐Based Electricity Generation in the Presence of Storage and Transmission Capacity," Production and Operations Management, Production and Operations Management Society, vol. 28(4), pages 970-989, April.
    6. Ilya O. Ryzhov & Martijn R. K. Mes & Warren B. Powell & Gerald van den Berg, 2019. "Bayesian Exploration for Approximate Dynamic Programming," Operations Research, INFORMS, vol. 67(1), pages 198-214, January.
    7. Jochen Gönsch & Michael Hassler, 2016. "Sell or store? An ADP approach to marketing renewable energy," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 38(3), pages 633-660, July.
    8. Ekaterina Abramova & Derek Bunn, 2021. "Optimal Daily Trading of Battery Operations Using Arbitrage Spreads," Energies, MDPI, vol. 14(16), pages 1-23, August.
    9. Benedikt Finnah, 2022. "Optimal bidding functions for renewable energies in sequential electricity markets," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 44(1), pages 1-27, March.
    10. Sandeep Rath & Kumar Rajaram, 2022. "Staff Planning for Hospitals with Implicit Cost Estimation and Stochastic Optimization," Production and Operations Management, Production and Operations Management Society, vol. 31(3), pages 1271-1289, March.
    11. Patrick J. Kehoe & Virgiliu Midrigan & Elena Pastorino, 2019. "Debt Constraints and Employment," Journal of Political Economy, University of Chicago Press, vol. 127(4), pages 1926-1991.
    12. Somayeh Moazeni & Warren B. Powell & Boris Defourny & Belgacem Bouzaiene-Ayari, 2017. "Parallel Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization," INFORMS Journal on Computing, INFORMS, vol. 29(2), pages 332-349, May.
    13. Mark D. Manuszak & Charles F. Manski & Sanghamitra Das, 2005. "Walk or wait? An empirical analysis of street crossing decisions," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 20(4), pages 529-548.
    14. Nadarajah, Selvaprabu & Secomandi, Nicola, 2023. "A review of the operations literature on real options in energy," European Journal of Operational Research, Elsevier, vol. 309(2), pages 469-487.
    15. Weitzel, Timm & Glock, Christoph H., 2018. "Energy management for stationary electric energy storage systems: A systematic literature review," European Journal of Operational Research, Elsevier, vol. 264(2), pages 582-606.
    16. Navid Mojir & K. Sudhir, 2014. "Price Search Across Time and Across Stores," Cowles Foundation Discussion Papers 1942R, Cowles Foundation for Research in Economics, Yale University, revised Jul 2019.
    17. Bastian Felix, 2012. "Gas Storage Valuation: A Comparative Simulation Study," EWL Working Papers 1201, University of Duisburg-Essen, Chair for Management Science and Energy Economics, revised Apr 2014.
    18. Qingyin Ma & John Stachurski, 2019. "Dynamic Optimal Choice When Rewards are Unbounded Below," Papers 1911.13025, arXiv.org.
    19. Manuel Arellano & Stéphane Bonhomme, 2017. "Nonlinear Panel Data Methods for Dynamic Heterogeneous Agent Models," Annual Review of Economics, Annual Reviews, vol. 9(1), pages 471-496, September.
    20. Keane, Michael P. & Todd, Petra E. & Wolpin, Kenneth I., 2011. "The Structural Estimation of Behavioral Models: Discrete Choice Dynamic Programming Methods and Applications," Handbook of Labor Economics, in: O. Ashenfelter & D. Card (ed.), Handbook of Labor Economics, edition 1, volume 4, chapter 4, pages 331-461, Elsevier.
