Approximate Dynamic Programming via a Smoothed Linear Program

Author

Listed:

Vijay V. Desai
(Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027)
Vivek F. Farias
(Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139)
Ciamac C. Moallemi
(Graduate School of Business, Columbia University, New York, New York 10027)

Abstract

We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural “projection” of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program, the “smoothed approximate linear program,” is distinct from such approaches and relaxes the restriction to lower bounding approximations in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages: First, we demonstrate bounds on the quality of approximation to the optimal cost-to-go function afforded by our approach. These bounds are, in general, no worse than those available for extant LP approaches and for specific problem instances can be shown to be arbitrarily stronger. Second, experiments with our approach on a pair of challenging problems (the game of Tetris and a queueing network control problem) show that the approach outperforms the existing LP approach (which has previously been shown to be competitive with several ADP algorithms) by a substantial margin.
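
As a rough illustration of the contrast drawn in the abstract, the two programs can be sketched as follows. The notation is an assumption introduced here and does not appear in the abstract: a discounted cost-minimizing Markov decision process with discount factor \alpha, Bellman operator T, basis matrix \Phi with weight vector r, state-relevance weights \nu, a probability distribution \pi over states, slack variables s, and a violation budget \theta \ge 0. The full paper should be consulted for the exact statement.

```latex
% Bellman operator (assumed notation): cost g, transition kernel P_a
(TJ)(x) = \min_{a} \Big[ g(x,a) + \alpha \sum_{x'} P_a(x,x')\, J(x') \Big]

% Approximate LP: the constraint forces \Phi r to lower-bound the optimal cost-to-go
\begin{aligned}
\max_{r} \quad & \nu^\top \Phi r \\
\text{s.t.} \quad & \Phi r \le T \Phi r
\end{aligned}

% Smoothed approximate LP: Bellman constraints may be violated,
% subject to a budget \theta on the \pi-weighted total violation
\begin{aligned}
\max_{r,\, s} \quad & \nu^\top \Phi r \\
\text{s.t.} \quad & \Phi r \le T \Phi r + s, \\
& \pi^\top s \le \theta, \quad s \ge 0
\end{aligned}
```

Under these assumptions, setting \theta = 0 forces s = 0 on the support of \pi and recovers the lower-bounding program, while \theta > 0 permits approximations that are not lower bounds. Both programs remain linear because \Phi r \le T \Phi r + s is equivalent to one linear inequality per state-action pair.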

Suggested Citation

Vijay V. Desai & Vivek F. Farias & Ciamac C. Moallemi, 2012. "Approximate Dynamic Programming via a Smoothed Linear Program," Operations Research, INFORMS, vol. 60(3), pages 655-674, June.

Handle: RePEc:inm:oropre:v:60:y:2012:i:3:p:655-674
DOI: 10.1287/opre.1120.1044

References listed on IDEAS

Vivek Farias & Denis Saure & Gabriel Y. Weintraub, 2012. "An approximate dynamic programming approach to solving dynamic oligopoly models," RAND Journal of Economics, RAND Corporation, vol. 43(2), pages 253-282, June.
Daniela Pucci de Farias & Benjamin Van Roy, 2004. "On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming," Mathematics of Operations Research, INFORMS, vol. 29(3), pages 462-478, August.
Sunil Kumar & Kumar Muthuraman, 2004. "A Numerical Method for Solving Singular Stochastic Control Problems," Operations Research, INFORMS, vol. 52(4), pages 563-582, August.
Daniela Pucci de Farias & Benjamin Van Roy, 2006. "A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees," Mathematics of Operations Research, INFORMS, vol. 31(3), pages 597-620, August.
Alan S. Manne, 1960. "Linear Programming and Sequential Decisions," Management Science, INFORMS, vol. 6(3), pages 259-267, April.
D. P. De Farias & B. Van Roy, 2000. "On the Existence of Fixed Points for Approximate Value Iteration and Temporal-Difference Learning," Journal of Optimization Theory and Applications, Springer, vol. 105(3), pages 589-608, June.
J. G. Dai & Wuqin Lin, 2005. "Maximum Pressure Policies in Stochastic Processing Networks," Operations Research, INFORMS, vol. 53(2), pages 197-218, April.
D. P. de Farias & B. Van Roy, 2003. "The Linear Programming Approach to Approximate Dynamic Programming," Operations Research, INFORMS, vol. 51(6), pages 850-865, December.

Citations

Cited by:

Andre P. Calmon & Florin D. Ciocan & Gonzalo Romero, 2021. "Revenue Management with Repeated Customer Interactions," Management Science, INFORMS, vol. 67(5), pages 2944-2963, May.
Alejandro Toriello & William B. Haskell & Michael Poremba, 2014. "A Dynamic Traveling Salesman Problem with Stochastic Arc Costs," Operations Research, INFORMS, vol. 62(5), pages 1107-1125, October.
Selvaprabu Nadarajah & François Margot & Nicola Secomandi, 2015. "Relaxations of Approximate Linear Programs for the Real Option Management of Commodity Storage," Management Science, INFORMS, vol. 61(12), pages 3054-3076, December.
Stephanie Carew & Mahesh Nagarajan & Steven Shechter & Jugpal Arneja & Erik Skarsgard, 2021. "Dynamic Capacity Allocation for Elective Surgeries: Reducing Urgency-Weighted Wait Times," Manufacturing & Service Operations Management, INFORMS, vol. 23(2), pages 407-424, March.
Vijay V. Desai & Vivek F. Farias & Ciamac C. Moallemi, 2012. "Pathwise Optimization for Optimal Stopping Problems," Management Science, INFORMS, vol. 58(12), pages 2292-2308, December.
Ohno, Katsuhisa & Boh, Toshitaka & Nakade, Koichi & Tamura, Takayoshi, 2016. "New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system," European Journal of Operational Research, Elsevier, vol. 249(1), pages 22-31.
Selvaprabu Nadarajah & Andre A. Cire, 2020. "Network-Based Approximate Linear Programming for Discrete Optimization," Operations Research, INFORMS, vol. 68(6), pages 1767-1786, November.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Vijay V. Desai & Vivek F. Farias & Ciamac C. Moallemi, 2012. "Pathwise Optimization for Optimal Stopping Problems," Management Science, INFORMS, vol. 58(12), pages 2292-2308, December.
Michael H. Veatch, 2013. "Approximate Linear Programming for Average Cost MDPs," Mathematics of Operations Research, INFORMS, vol. 38(3), pages 535-544, August.
Diego Klabjan & Daniel Adelman, 2007. "An Infinite-Dimensional Linear Programming Algorithm for Deterministic Semi-Markov Decision Processes on Borel Spaces," Mathematics of Operations Research, INFORMS, vol. 32(3), pages 528-550, August.
Fan You & Thomas Vossen, 2024. "An Approximate Dynamic Programming Approach to Dynamic Stochastic Matching," INFORMS Journal on Computing, INFORMS, vol. 36(4), pages 1006-1022, July.
Schütz, Hans-Jörg & Kolisch, Rainer, 2012. "Approximate dynamic programming for capacity allocation in the service industry," European Journal of Operational Research, Elsevier, vol. 218(1), pages 239-250.
Alejandro Toriello & William B. Haskell & Michael Poremba, 2014. "A Dynamic Traveling Salesman Problem with Stochastic Arc Costs," Operations Research, INFORMS, vol. 62(5), pages 1107-1125, October.
Stefan Heinz & Jörg Rambau & Andreas Tuchscherer, 2014. "Computational bounds for elevator control policies by large scale linear programming," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 79(1), pages 87-117, February.
Amin Khademi & Denis R. Saure & Andrew J. Schaefer & Ronald S. Braithwaite & Mark S. Roberts, 2015. "The Price of Nonabandonment: HIV in Resource-Limited Settings," Manufacturing & Service Operations Management, INFORMS, vol. 17(4), pages 554-570, October.
Ken Moon & Patrick Bergemann & Daniel Brown & Andrew Chen & James Chu & Ellen A. Eisen & Gregory M. Fischer & Prashant Loyalka & Sungmin Rho & Joshua Cohen, 2023. "Manufacturing Productivity with Worker Turnover," Management Science, INFORMS, vol. 69(4), pages 1995-2015, April.
Thomas W. M. Vossen & Dan Zhang, 2015. "Reductions of Approximate Linear Programs for Network Revenue Management," Operations Research, INFORMS, vol. 63(6), pages 1352-1371, December.
Melda Ormeci Matoglu & John Vande Vate, 2011. "Drift Control with Changeover Costs," Operations Research, INFORMS, vol. 59(2), pages 427-439, April.
Meissner, Joern & Strauss, Arne, 2012. "Network revenue management with inventory-sensitive bid prices and customer choice," European Journal of Operational Research, Elsevier, vol. 216(2), pages 459-468.
- Joern Meissner & Arne Strauss, 2008. "Network Revenue Management with Inventory-Sensitive Bid Prices and Customer Choice," Working Papers MRG/0008, Department of Management Science, Lancaster University, revised Apr 2010.
Daniel Adelman & Adam J. Mersereau, 2008. "Relaxations of Weakly Coupled Stochastic Dynamic Programs," Operations Research, INFORMS, vol. 56(3), pages 712-727, June.
D. P. de Farias & B. Van Roy, 2003. "The Linear Programming Approach to Approximate Dynamic Programming," Operations Research, INFORMS, vol. 51(6), pages 850-865, December.
Dimitris Bertsimas & Velibor V. Mišić, 2016. "Decomposable Markov Decision Processes: A Fluid Optimization Approach," Operations Research, INFORMS, vol. 64(6), pages 1537-1555, December.
Benjamin Van Roy, 2006. "Performance Loss Bounds for Approximate Value Iteration with State Aggregation," Mathematics of Operations Research, INFORMS, vol. 31(2), pages 234-244, May.
Laumer, Simon & Barz, Christiane, 2023. "Reductions of non-separable approximate linear programs for network revenue management," European Journal of Operational Research, Elsevier, vol. 309(1), pages 252-270.
Marquinez, José Tomás & Sauré, Antoine & Cataldo, Alejandro & Ferrer, Juan-Carlos, 2021. "Identifying proactive ICU patient admission, transfer and diversion policies in a public-private hospital network," European Journal of Operational Research, Elsevier, vol. 295(1), pages 306-320.
Andre P. Calmon & Florin D. Ciocan & Gonzalo Romero, 2021. "Revenue Management with Repeated Customer Interactions," Management Science, INFORMS, vol. 67(5), pages 2944-2963, May.
Woerner, Stefan & Laumanns, Marco & Zenklusen, Rico & Fertis, Apostolos, 2015. "Approximate dynamic programming for stochastic linear control problems on compact state spaces," European Journal of Operational Research, Elsevier, vol. 241(1), pages 85-98.

More about this item

Keywords

optimization; linear programming; stochastic control; Markov decision processes; approximate dynamic programming;