A multi-agent deep reinforcement learning approach for multi-echelon inventory optimization and its application to the beer game
DOI: 10.1016/j.tre.2025.104367
References listed on IDEAS
- Kevin Geevers & Lotte Hezewijk & Martijn R. K. Mes, 2024. "Multi-echelon inventory optimization using deep reinforcement learning," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 32(3), pages 653-683, September.
- Lee, Junhyeok & Shin, Youngchul & Moon, Ilkyeong, 2024. "A hybrid deep reinforcement learning approach for a proactive transshipment of fresh food in the online–offline channel system," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 187(C).
- Andrew J. Clark & Herbert Scarf, 2004. "Optimal Policies for a Multi-Echelon Inventory Problem," Management Science, INFORMS, vol. 50(12_supple), pages 1782-1790, December.
- Andrew J. Clark & Herbert Scarf, 1960. "Optimal Policies for a Multi-Echelon Inventory Problem," Management Science, INFORMS, vol. 6(4), pages 475-490, July.
- Stranieri, Francesco & Fadda, Edoardo & Stella, Fabio, 2024. "Combining deep reinforcement learning and multi-stage stochastic programming to address the supply chain inventory management problem," International Journal of Production Economics, Elsevier, vol. 268(C).
- Fangruo Chen & Yu-Sheng Zheng, 1994. "Lower Bounds for Multi-Echelon Stochastic Inventory Systems," Management Science, INFORMS, vol. 40(11), pages 1426-1443, November.
- Afshin Oroojlooyjadid & MohammadReza Nazari & Lawrence V. Snyder & Martin Takáč, 2022. "A Deep Q-Network for the Beer Game: Deep Reinforcement Learning for Inventory Optimization," Manufacturing & Service Operations Management, INFORMS, vol. 24(1), pages 285-304, January.
- de Kok, Ton & Grob, Christopher & Laumanns, Marco & Minner, Stefan & Rambau, Jörg & Schade, Konrad, 2018. "A typology and literature review on stochastic multi-echelon inventory models," European Journal of Operational Research, Elsevier, vol. 269(3), pages 955-983.
- Kenneth F. Simpson, 1958. "In-Process Inventories," Operations Research, INFORMS, vol. 6(6), pages 863-873, December.
- Stephen C. Graves, 1985. "A Multi-Echelon Inventory Model for a Repairable Item with One-for-One Replenishment," Management Science, INFORMS, vol. 31(10), pages 1247-1256, October.
- Hau L. Lee & V. Padmanabhan & Seungjin Whang, 1997. "Information Distortion in a Supply Chain: The Bullwhip Effect," Management Science, INFORMS, vol. 43(4), pages 546-558, April.
- Fangruo Chen & Yu-Sheng Zheng, 1998. "Near-Optimal Echelon-Stock (R, nQ) Policies in Multistage Serial Systems," Operations Research, INFORMS, vol. 46(4), pages 592-602, August.
- Dony S. Kurian & V. Madhusudanan Pillai & J. Gautham & Akash Raut, 2023. "Data-driven imitation learning-based approach for order size determination in supply chains," European Journal of Industrial Engineering, Inderscience Enterprises Ltd, vol. 17(3), pages 379-407.
- Yang Deng & Andy H. F. Chow & Yimo Yan & Zicheng Su & Zhili Zhou & Yong-Hong Kuo, 2025. "Hierarchical production control and distribution planning under retail uncertainty with reinforcement learning," International Journal of Production Research, Taylor & Francis Journals, vol. 63(12), pages 4504-4522, June.
- Guillermo Gallego & Paul Zipkin, 1999. "Stock Positioning and Performance Estimation in Serial Production-Transportation Systems," Manufacturing & Service Operations Management, INFORMS, vol. 1(1), pages 77-88.
- Yan, Yimo & Chow, Andy H.F. & Ho, Chin Pang & Kuo, Yong-Hong & Wu, Qihao & Ying, Chengshuo, 2022. "Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 162(C).
- Lee, Hyun-Rok & Lee, Taesik, 2021. "Multi-agent reinforcement learning algorithm to solve a partially-observable multi-agent problem in disaster response," European Journal of Operational Research, Elsevier, vol. 291(1), pages 296-308.
- John D. Sterman, 1989. "Modeling Managerial Behavior: Misperceptions of Feedback in a Dynamic Decision Making Experiment," Management Science, INFORMS, vol. 35(3), pages 321-339, March.
- Boute, Robert N. & Gijsbrechts, Joren & van Jaarsveld, Willem & Vanvuchelen, Nathalie, 2022. "Deep reinforcement learning for inventory control: A roadmap," European Journal of Operational Research, Elsevier, vol. 298(2), pages 401-412.
- Joren Gijsbrechts & Robert N. Boute & Jan A. Van Mieghem & Dennis J. Zhang, 2022. "Can Deep Reinforcement Learning Improve Inventory Management? Performance on Lost Sales, Dual-Sourcing, and Multi-Echelon Problems," Manufacturing & Service Operations Management, INFORMS, vol. 24(3), pages 1349-1368, May.
- Paul Glasserman & Sridhar Tayur, 1995. "Sensitivity Analysis for Base-Stock Levels in Multiechelon Production-Inventory Systems," Management Science, INFORMS, vol. 41(2), pages 263-281, February.
- Rachel Croson & Karen Donohue, 2006. "Behavioral Causes of the Bullwhip Effect and the Observed Value of Inventory Information," Management Science, INFORMS, vol. 52(3), pages 323-336, March.
- Kaj Rosling, 1989. "Optimal Inventory Policies for Assembly Systems Under Random Demands," Operations Research, INFORMS, vol. 37(4), pages 565-579, August.
- Daniel S. Bernstein & Robert Givan & Neil Immerman & Shlomo Zilberstein, 2002. "The Complexity of Decentralized Control of Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 27(4), pages 819-840, November.
- Francesco Stranieri & Fabio Stella & Chaaben Kouki, 2024. "Performance of deep reinforcement learning algorithms in two-echelon inventory control systems," International Journal of Production Research, Taylor & Francis Journals, vol. 62(17), pages 6211-6226, September.
- Manupati, Vijaya Kumar & Schoenherr, Tobias & Subramanian, Nachiappan & Ramkumar, M. & Soni, Bhanushree & Panigrahi, Suraj, 2021. "A multi-echelon dynamic cold chain for managing vaccine distribution," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 156(C).
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Fleuren, Tijn, 2025. "Stochastic approaches for production-inventory planning : Applications to high-tech supply chains," Other publications TiSEM 1fe1bbe5-fd90-4077-8606-d, Tilburg University, School of Economics and Management.
- de Kok, Ton & Grob, Christopher & Laumanns, Marco & Minner, Stefan & Rambau, Jörg & Schade, Konrad, 2018. "A typology and literature review on stochastic multi-echelon inventory models," European Journal of Operational Research, Elsevier, vol. 269(3), pages 955-983.
- Bergsma, Ritsaart & de Ruijt, Corné & Bhulai, Sandjai, 2025. "A systematic review of machine learning approaches in inventory control optimization," Operations Research Perspectives, Elsevier, vol. 15(C).
- Barnes-Schuster, Dawn & Bassok, Yehuda & Anupindi, Ravi, 2006. "Optimizing delivery lead time/inventory placement in a two-stage production/distribution system," European Journal of Operational Research, Elsevier, vol. 174(3), pages 1664-1684, November.
- Vanteddu, Gangaraju & Chinnam, Ratna Babu & Gushikin, Oleg, 2011. "Supply chain focus dependent supplier selection problem," International Journal of Production Economics, Elsevier, vol. 129(1), pages 204-216, January.
- Lingxiu Dong & Hau L. Lee, 2003. "Optimal Policies and Approximations for a Serial Multiechelon Inventory System with Time-Correlated Demand," Operations Research, INFORMS, vol. 51(6), pages 969-980, December.
- Tan Wang & L. Jeff Hong, 2023. "Large-Scale Inventory Optimization: A Recurrent Neural Networks–Inspired Simulation Approach," INFORMS Journal on Computing, INFORMS, vol. 35(1), pages 196-215, January.
- Noel Watson & Yu-Sheng Zheng, 2005. "Decentralized Serial Supply Chains Subject to Order Delays and Information Distortion: Exploiting Real-Time Sales Data," Manufacturing & Service Operations Management, INFORMS, vol. 7(2), pages 152-168, May.
- Li, Xiuhui & Wang, Qinan, 2007. "Coordination mechanisms of supply chain systems," European Journal of Operational Research, Elsevier, vol. 179(1), pages 1-16, May.
- Cui, Geng & Imura, Naoto & Nishinari, Katsuhiro & Ezaki, Takahiro, 2025. "On order smoothing interpolating the order-up-to and constant order policies," Omega, Elsevier, vol. 136(C).
- Fleuren, Tijn & Merzifonluoglu, Yasemin & Sotirov, Renata & Hendriks, Maarten, 2025. "Production–inventory planning in high-tech low-volume manufacturing supply chains," International Journal of Production Economics, Elsevier, vol. 288(C).
- van Houtum, G. J. & Inderfurth, K. & Zijm, W. H. M., 1996. "Materials coordination in stochastic multi-echelon systems," European Journal of Operational Research, Elsevier, vol. 95(1), pages 1-23, November.
- Guillermo Gallego & Özalp Özer & Paul Zipkin, 2007. "Bounds, Heuristics, and Approximations for Distribution Systems," Operations Research, INFORMS, vol. 55(3), pages 503-517, June.
- Guillermo Gallego & Paul Zipkin, 1999. "Stock Positioning and Performance Estimation in Serial Production-Transportation Systems," Manufacturing & Service Operations Management, INFORMS, vol. 1(1), pages 77-88.
- Rodney P. Parker & Roman Kapuscinski, 2004. "Optimal Policies for a Capacitated Two-Echelon Inventory System," Operations Research, INFORMS, vol. 52(5), pages 739-755, October.
- Fangruo Chen, 2000. "Optimal Policies for Multi-Echelon Inventory Problems with Batch Ordering," Operations Research, INFORMS, vol. 48(3), pages 376-389, June.
- David Simchi-Levi & Yao Zhao, 2005. "Safety Stock Positioning in Supply Chains with Stochastic Lead Times," Manufacturing & Service Operations Management, INFORMS, vol. 7(4), pages 295-318, December.
- Ming Hu & Yi Yang, 2014. "Modified Echelon ( r, Q ) Policies with Guaranteed Performance Bounds for Stochastic Serial Inventory Systems," Operations Research, INFORMS, vol. 62(4), pages 812-828, August.
- Alp Muharremoglu & John N. Tsitsiklis, 2008. "A Single-Unit Decomposition Approach to Multiechelon Inventory Systems," Operations Research, INFORMS, vol. 56(5), pages 1089-1103, October.
- Fangruo Chen & Rungson Samroengraja, 2000. "A Staggered Ordering Policy for One-Warehouse, Multiretailer Systems," Operations Research, INFORMS, vol. 48(2), pages 281-293, April.
Handle: RePEc:eee:transe:v:203:y:2025:i:c:s1366554525004089