
Approximate Dynamic Programming via a Smoothed Linear Program

Authors

  • Vijay V. Desai (Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027)
  • Vivek F. Farias (Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139)
  • Ciamac C. Moallemi (Graduate School of Business, Columbia University, New York, New York 10027)

Abstract

We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural “projection” of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program, the “smoothed approximate linear program,” is distinct from such approaches and relaxes the restriction to lower bounding approximations in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages. First, we demonstrate bounds on the quality of approximation to the optimal cost-to-go function afforded by our approach. These bounds are, in general, no worse than those available for extant LP approaches and for specific problem instances can be shown to be arbitrarily stronger. Second, experiments with our approach on a pair of challenging problems (the game of Tetris and a queueing network control problem) show that the approach outperforms the existing LP approach (which has previously been shown to be competitive with several ADP algorithms) by a substantial margin.
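
The contrast drawn in the abstract can be sketched as follows for a discounted cost-minimization problem. All notation here is an assumption introduced for illustration and does not appear in the abstract: $T$ is the Bellman operator with stage cost $g$, transition kernel $P$, and discount factor $\alpha \in (0,1)$; $\Phi$ is a matrix of basis functions with weight vector $r$; $\nu$ is a positive state-relevance weighting; $\pi$ is a distribution used to weigh constraint violations; $s$ is a vector of slacks; and $\theta \ge 0$ is a violation budget. This is a rough sketch of the idea, not a statement of the paper's exact formulation.

```latex
% Exact LP for dynamic programming (its optimum is the optimal cost-to-go J^*):
\max_{J} \ \nu^\top J \quad \text{s.t.} \quad J \le TJ,
\qquad (TJ)(x) = \min_{a} \Big[ g(x,a) + \alpha \sum_{x'} P(x' \mid x, a)\, J(x') \Big].

% "Projected" approximate LP: restrict J to the span of the basis, J = \Phi r.
% Any feasible \Phi r satisfies \Phi r \le J^*, so it is a lower bound.
\max_{r} \ \nu^\top \Phi r \quad \text{s.t.} \quad \Phi r \le T \Phi r.

% Smoothed variant: the Bellman inequality may be violated by slacks s,
% with the total weighted violation capped by the budget \theta.
\max_{r,\, s} \ \nu^\top \Phi r \quad \text{s.t.} \quad
\Phi r \le T \Phi r + s, \qquad \pi^\top s \le \theta, \qquad s \ge 0.
```

Under these assumptions, setting $\theta = 0$ with a strictly positive $\pi$ forces $s = 0$ and recovers the projected program, while $\theta > 0$ admits approximations that are no longer lower bounds. Each constraint involving the minimum over actions is equivalent to one linear constraint per state-action pair, so all three programs remain linear.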

Suggested Citation

  • Vijay V. Desai & Vivek F. Farias & Ciamac C. Moallemi, 2012. "Approximate Dynamic Programming via a Smoothed Linear Program," Operations Research, INFORMS, vol. 60(3), pages 655-674, June.
  • Handle: RePEc:inm:oropre:v:60:y:2012:i:3:p:655-674
    DOI: 10.1287/opre.1120.1044

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/opre.1120.1044
    Download Restriction: no

    Citations

    Cited by:

    1. Andre P. Calmon & Florin D. Ciocan & Gonzalo Romero, 2021. "Revenue Management with Repeated Customer Interactions," Management Science, INFORMS, vol. 67(5), pages 2944-2963, May.
    2. Alejandro Toriello & William B. Haskell & Michael Poremba, 2014. "A Dynamic Traveling Salesman Problem with Stochastic Arc Costs," Operations Research, INFORMS, vol. 62(5), pages 1107-1125, October.
    3. Vijay V. Desai & Vivek F. Farias & Ciamac C. Moallemi, 2012. "Pathwise Optimization for Optimal Stopping Problems," Management Science, INFORMS, vol. 58(12), pages 2292-2308, December.
    4. Selvaprabu Nadarajah & François Margot & Nicola Secomandi, 2015. "Relaxations of Approximate Linear Programs for the Real Option Management of Commodity Storage," Management Science, INFORMS, vol. 61(12), pages 3054-3076, December.
    5. Stephanie Carew & Mahesh Nagarajan & Steven Shechter & Jugpal Arneja & Erik Skarsgard, 2021. "Dynamic Capacity Allocation for Elective Surgeries: Reducing Urgency-Weighted Wait Times," Manufacturing & Service Operations Management, INFORMS, vol. 23(2), pages 407-424, March.
    6. Ohno, Katsuhisa & Boh, Toshitaka & Nakade, Koichi & Tamura, Takayoshi, 2016. "New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system," European Journal of Operational Research, Elsevier, vol. 249(1), pages 22-31.
    7. Selvaprabu Nadarajah & Andre A. Cire, 2020. "Network-Based Approximate Linear Programming for Discrete Optimization," Operations Research, INFORMS, vol. 68(6), pages 1767-1786, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Diego Klabjan & Daniel Adelman, 2007. "An Infinite-Dimensional Linear Programming Algorithm for Deterministic Semi-Markov Decision Processes on Borel Spaces," Mathematics of Operations Research, INFORMS, vol. 32(3), pages 528-550, August.
    2. Vijay V. Desai & Vivek F. Farias & Ciamac C. Moallemi, 2012. "Pathwise Optimization for Optimal Stopping Problems," Management Science, INFORMS, vol. 58(12), pages 2292-2308, December.
    3. Michael H. Veatch, 2013. "Approximate Linear Programming for Average Cost MDPs," Mathematics of Operations Research, INFORMS, vol. 38(3), pages 535-544, August.
    4. Schütz, Hans-Jörg & Kolisch, Rainer, 2012. "Approximate dynamic programming for capacity allocation in the service industry," European Journal of Operational Research, Elsevier, vol. 218(1), pages 239-250.
    5. Alejandro Toriello & William B. Haskell & Michael Poremba, 2014. "A Dynamic Traveling Salesman Problem with Stochastic Arc Costs," Operations Research, INFORMS, vol. 62(5), pages 1107-1125, October.
    6. Stefan Heinz & Jörg Rambau & Andreas Tuchscherer, 2014. "Computational bounds for elevator control policies by large scale linear programming," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 79(1), pages 87-117, February.
    7. Amin Khademi & Denis R. Saure & Andrew J. Schaefer & Ronald S. Braithwaite & Mark S. Roberts, 2015. "The Price of Nonabandonment: HIV in Resource-Limited Settings," Manufacturing & Service Operations Management, INFORMS, vol. 17(4), pages 554-570, October.
    8. Ken Moon & Patrick Bergemann & Daniel Brown & Andrew Chen & James Chu & Ellen A. Eisen & Gregory M. Fischer & Prashant Loyalka & Sungmin Rho & Joshua Cohen, 2023. "Manufacturing Productivity with Worker Turnover," Management Science, INFORMS, vol. 69(4), pages 1995-2015, April.
    9. Meissner, Joern & Strauss, Arne, 2012. "Network revenue management with inventory-sensitive bid prices and customer choice," European Journal of Operational Research, Elsevier, vol. 216(2), pages 459-468.
    10. Thomas W. M. Vossen & Dan Zhang, 2015. "Reductions of Approximate Linear Programs for Network Revenue Management," Operations Research, INFORMS, vol. 63(6), pages 1352-1371, December.
    11. Höfferl, F. & Steinschorn, D., 2009. "A dynamic programming extension to the steady state refinery-LP," European Journal of Operational Research, Elsevier, vol. 197(2), pages 465-474, September.
    12. Melda Ormeci Matoglu & John Vande Vate, 2011. "Drift Control with Changeover Costs," Operations Research, INFORMS, vol. 59(2), pages 427-439, April.
    13. Daniel Adelman & Adam J. Mersereau, 2008. "Relaxations of Weakly Coupled Stochastic Dynamic Programs," Operations Research, INFORMS, vol. 56(3), pages 712-727, June.
    14. Daniel Adelman & Diego Klabjan, 2012. "Computing Near-Optimal Policies in Generalized Joint Replenishment," INFORMS Journal on Computing, INFORMS, vol. 24(1), pages 148-164, February.
    15. D. P. de Farias & B. Van Roy, 2003. "The Linear Programming Approach to Approximate Dynamic Programming," Operations Research, INFORMS, vol. 51(6), pages 850-865, December.
    16. Qihang Lin & Selvaprabu Nadarajah & Negar Soheili, 2020. "Revisiting Approximate Linear Programming: Constraint-Violation Learning with Applications to Inventory Control and Energy Storage," Management Science, INFORMS, vol. 66(4), pages 1544-1562, April.
    17. Dimitris Bertsimas & Velibor V. Mišić, 2016. "Decomposable Markov Decision Processes: A Fluid Optimization Approach," Operations Research, INFORMS, vol. 64(6), pages 1537-1555, December.
    18. Benjamin Van Roy, 2006. "Performance Loss Bounds for Approximate Value Iteration with State Aggregation," Mathematics of Operations Research, INFORMS, vol. 31(2), pages 234-244, May.
    19. Antoine Sauré & Jonathan Patrick & Martin L. Puterman, 2015. "Simulation-Based Approximate Policy Iteration with Generalized Logistic Functions," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 579-595, August.
    20. Laumer, Simon & Barz, Christiane, 2023. "Reductions of non-separable approximate linear programs for network revenue management," European Journal of Operational Research, Elsevier, vol. 309(1), pages 252-270.
