
New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system

Author

Listed:
  • Ohno, Katsuhisa
  • Boh, Toshitaka
  • Nakade, Koichi
  • Tamura, Takayoshi

Abstract

Undiscounted Markov decision processes (UMDPs) can formulate optimal stochastic control problems that minimize the expected total cost per period for various systems. We propose new approximate dynamic programming (ADP) algorithms for large-scale UMDPs that can overcome the curses of dimensionality. These algorithms, called simulation-based modified policy iteration (SBMPI) algorithms, are extensions of the simulation-based modified policy iteration method (SBMPIM) (Ohno, 2011) for optimal control problems of multistage JIT-based production and distribution systems with stochastic demand and production capacity. The main new concepts of the SBMPI algorithms are that the simulation-based policy evaluation step of the SBMPIM is replaced by the partial policy evaluation step of the modified policy iteration method (MPIM), and that the algorithms start from the expected total cost per period and relative value estimated by simulating the system under a reasonable initial policy.
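
To make the idea of a partial policy evaluation step concrete, the following is a minimal sketch of a textbook modified policy iteration for an average-cost MDP. It is not the authors' SBMPI algorithm: it uses an exact, fully known transition model instead of simulation, it starts from zero relative values rather than simulated estimates, and the two-state machine, its probabilities, its costs, and the names `P`, `c`, `m`, and `modified_policy_iteration` are all hypothetical choices made for illustration.

```python
"""Modified policy iteration for an average-cost (undiscounted) MDP: toy sketch."""

# Hypothetical two-state machine: state 0 = good, state 1 = worn.
# Actions: 0 = keep running, 1 = repair. All numbers are illustrative only.
# P[a][s][j] = probability of moving from state s to state j under action a.
P = [
    [[0.90, 0.10], [0.10, 0.90]],  # action 0
    [[0.95, 0.05], [0.80, 0.20]],  # action 1
]
# c[s][a] = one-period cost of taking action a in state s.
c = [
    [0.0, 2.0],
    [5.0, 3.0],
]


def modified_policy_iteration(P, c, m=5, eps=1e-9, max_iter=10_000):
    """Return (policy, average cost per period, relative values)."""
    n_states = len(c)
    n_actions = len(c[0])
    # Assumption: start from zero relative values. The abstract instead
    # describes starting from estimates obtained by simulation.
    v = [0.0] * n_states

    def backup(s, a, values):
        # One-period cost plus expected relative value of the next state.
        return c[s][a] + sum(P[a][s][j] * values[j] for j in range(n_states))

    for _ in range(max_iter):
        # Policy improvement: act greedily with respect to the current v.
        policy = [
            min(range(n_actions), key=lambda a: backup(s, a, v))
            for s in range(n_states)
        ]
        w = [backup(s, policy[s], v) for s in range(n_states)]

        # For unichain models the optimal average cost lies between the
        # smallest and largest component of w - v, so stop when they agree.
        diff = [w[s] - v[s] for s in range(n_states)]
        lo, hi = min(diff), max(diff)
        if hi - lo < eps:
            return policy, 0.5 * (lo + hi), v

        # Partial policy evaluation: only m further backups under the fixed
        # policy, instead of solving the evaluation equations exactly.
        # Subtracting the value of reference state 0 keeps v bounded.
        v = [x - w[0] for x in w]
        for _ in range(m):
            w = [backup(s, policy[s], v) for s in range(n_states)]
            v = [x - w[0] for x in w]

    raise RuntimeError("no convergence within max_iter")


policy, g, v = modified_policy_iteration(P, c)
print(policy, round(g, 6))
```

For this toy data, the policy "keep running when good, repair when worn" has stationary distribution (8/9, 1/9) and therefore a long-run average cost of 3 × 1/9 = 1/3 per period, which is the lowest of the four stationary policies. The parameter `m` controls how much evaluation work is done per improvement step; a small `m` behaves like value iteration and a large `m` approaches full policy iteration.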

Suggested Citation

  • Ohno, Katsuhisa & Boh, Toshitaka & Nakade, Koichi & Tamura, Takayoshi, 2016. "New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system," European Journal of Operational Research, Elsevier, vol. 249(1), pages 22-31.
  • Handle: RePEc:eee:ejores:v:249:y:2016:i:1:p:22-31
    DOI: 10.1016/j.ejor.2015.07.026

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221715006591
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ejor.2015.07.026?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Andrew J. Clark & Herbert Scarf, 2004. "Optimal Policies for a Multi-Echelon Inventory Problem," Management Science, INFORMS, vol. 50(12_supple), pages 1782-1790, December.
    2. Ohno, Katsuhisa, 2011. "The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems," European Journal of Operational Research, Elsevier, vol. 213(1), pages 124-133, August.
    3. Richard Bellman, 1957. "On a Dynamic Programming Approach to the Caterer Problem--I," Management Science, INFORMS, vol. 3(3), pages 270-278, April.
    4. Katsuhisa Ohno & Kuniyoshi Ichiki, 1987. "Computing Optimal Policies for Controlled Tandem Queueing Systems," Operations Research, INFORMS, vol. 35(1), pages 121-126, February.
    5. Vijay V. Desai & Vivek F. Farias & Ciamac C. Moallemi, 2012. "Approximate Dynamic Programming via a Smoothed Linear Program," Operations Research, INFORMS, vol. 60(3), pages 655-674, June.
    6. Tapas K. Das & Abhijit Gosavi & Sridhar Mahadevan & Nicholas Marchalleck, 1999. "Solving Semi-Markov Decision Problems Using Average Reward Reinforcement Learning," Management Science, INFORMS, vol. 45(4), pages 560-574, April.

    Citations

    Cited by:

    1. de Kok, Ton & Grob, Christopher & Laumanns, Marco & Minner, Stefan & Rambau, Jörg & Schade, Konrad, 2018. "A typology and literature review on stochastic multi-echelon inventory models," European Journal of Operational Research, Elsevier, vol. 269(3), pages 955-983.
    2. Annear, Luis Mauricio & Akhavan-Tabatabaei, Raha & Schmid, Verena, 2023. "Dynamic assignment of a multi-skilled workforce in job shops: An approximate dynamic programming approach," European Journal of Operational Research, Elsevier, vol. 306(3), pages 1109-1125.
    3. Cerqueti, Roy & Falbo, Paolo & Pelizzari, Cristian, 2017. "Relevant states and memory in Markov chain bootstrapping and simulation," European Journal of Operational Research, Elsevier, vol. 256(1), pages 163-177.
    4. Sankaranarayanan, Sriram & Feijoo, Felipe & Siddiqui, Sauleh, 2018. "Sensitivity and covariance in stochastic complementarity problems with an application to North American natural gas markets," European Journal of Operational Research, Elsevier, vol. 268(1), pages 25-36.
    5. Barlow, E. & Bedford, T. & Revie, M. & Tan, J. & Walls, L., 2021. "A performance-centred approach to optimising maintenance of complex systems," European Journal of Operational Research, Elsevier, vol. 292(2), pages 579-595.
    6. Cheng, Bayi & Leung, Joseph Y.-T. & Li, Kai & Yang, Shanlin, 2019. "Integrated optimization of material supplying, manufacturing, and product distribution: Models and fast algorithms," European Journal of Operational Research, Elsevier, vol. 277(1), pages 100-111.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ohno, Katsuhisa, 2011. "The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems," European Journal of Operational Research, Elsevier, vol. 213(1), pages 124-133, August.
    2. Noordhoek, Marije & Dullaert, Wout & Lai, David S.W. & de Leeuw, Sander, 2018. "A simulation–optimization approach for a service-constrained multi-echelon distribution network," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 114(C), pages 292-311.
    3. Qu, Zhan & Raff, Horst & Schmitt, Nicolas, 2018. "Incentives through inventory control in supply chains," International Journal of Industrial Organization, Elsevier, vol. 59(C), pages 486-513.
    4. Voelkel, Michael A. & Sachs, Anna-Lena & Thonemann, Ulrich W., 2020. "An aggregation-based approximate dynamic programming approach for the periodic review model with random yield," European Journal of Operational Research, Elsevier, vol. 281(2), pages 286-298.
    5. Jan A. Van Mieghem & Nils Rudi, 2002. "Newsvendor Networks: Inventory Management and Capacity Investment with Discretionary Activities," Manufacturing & Service Operations Management, INFORMS, vol. 4(4), pages 313-335, August.
    6. Hill, R.M. & Seifbarghy, M. & Smith, D.K., 2007. "A two-echelon inventory model with lost sales," European Journal of Operational Research, Elsevier, vol. 181(2), pages 753-766, September.
    7. Tan, Madeleine Sui-Lay, 2016. "Policy coordination among the ASEAN-5: A global VAR analysis," Journal of Asian Economics, Elsevier, vol. 44(C), pages 20-40.
    8. Carole Camisullis & Vincent Giard, 2010. "Détermination des stocks de sécurité dans une chaîne logistique-amont dédiée à une production de masse de produits fortement diversifiés," Working Papers hal-00876986, HAL.
    9. Preil, Deniz & Krapp, Michael, 2022. "Bandit-based inventory optimisation: Reinforcement learning in multi-echelon supply chains," International Journal of Production Economics, Elsevier, vol. 252(C).
    10. D. W. K. Yeung, 2008. "Dynamically Consistent Solution For A Pollution Management Game In Collaborative Abatement With Uncertain Future Payoffs," International Game Theory Review (IGTR), World Scientific Publishing Co. Pte. Ltd., vol. 10(04), pages 517-538.
    11. Qu, Zhan & Raff, Horst & Schmitt, Nicolas, 2016. "A theory of intermediation in supply chains based on inventory control," CEPIE Working Papers 09/16, Technische Universität Dresden, Center of Public and International Economics (CEPIE).
    12. Hanafi, Said & Freville, Arnaud, 1998. "An efficient tabu search approach for the 0-1 multidimensional knapsack problem," European Journal of Operational Research, Elsevier, vol. 106(2-3), pages 659-675, April.
    13. van der Heijden, Matthieu, 2000. "Near cost-optimal inventory control policies for divergent networks under fill rate constraints," International Journal of Production Economics, Elsevier, vol. 63(2), pages 161-179, January.
    14. Sari, Kazim, 2010. "Exploring the impacts of radio frequency identification (RFID) technology on supply chain performance," European Journal of Operational Research, Elsevier, vol. 207(1), pages 174-183, November.
    15. Renato Cordeiro Amorim, 2016. "A Survey on Feature Weighting Based K-Means Algorithms," Journal of Classification, Springer;The Classification Society, vol. 33(2), pages 210-242, July.
    16. Dmitri Blueschke & Ivan Savin, 2015. "No such thing like perfect hammer: comparing different objective function specifications for optimal control," Jena Economics Research Papers 2015-005, Friedrich-Schiller-University Jena.
    17. Mustafa Doğru & A. Kok & G. Houtum, 2013. "Newsvendor characterizations for one-warehouse multi-retailer inventory systems with discrete demand under the balance assumption," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 21(3), pages 541-559, September.
    18. Arts, Joachim & Kiesmüller, Gudrun P., 2013. "Analysis of a two-echelon inventory system with two supply modes," European Journal of Operational Research, Elsevier, vol. 225(2), pages 263-272.
    19. Changming Ji & Chuangang Li & Boquan Wang & Minghao Liu & Liping Wang, 2017. "Multi-Stage Dynamic Programming Method for Short-Term Cascade Reservoirs Optimal Operation with Flow Attenuation," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 31(14), pages 4571-4586, November.
    20. Ghassan, Hassan B. & Al-Jefri, Essam H., 2015. "الحساب الجاري في المدى البعيد عبر نموذج داخلي الزمن [The Current Account in the Long Run through the Intertemporal Model]," MPRA Paper 66527, University Library of Munich, Germany.
