Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities

Authors

  • Yan, Yimo
  • Chow, Andy H.F.
  • Ho, Chin Pang
  • Kuo, Yong-Hong
  • Wu, Qihao
  • Ying, Chengshuo

Abstract

With advances in technologies, data science techniques, and computing equipment, there has been rapidly increasing interest in the applications of reinforcement learning (RL) to address the challenges resulting from the evolving business and organisational operations in logistics and supply chain management (SCM). This paper aims to provide a comprehensive review of the development and applications of RL techniques in the field of logistics and SCM. We first provide an introduction to RL methodologies, followed by a classification of previous research studies by application. The state-of-the-art research is reviewed and the current challenges are discussed. It is found that Q-learning (QL) is the most popular RL approach adopted by these studies and the research on RL for urban logistics is growing in recent years due to the prevalence of E-commerce and last mile delivery. Finally, some potential directions are presented for future research.

Suggested Citation

  • Yan, Yimo & Chow, Andy H.F. & Ho, Chin Pang & Kuo, Yong-Hong & Wu, Qihao & Ying, Chengshuo, 2022. "Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 162(C).
  • Handle: RePEc:eee:transe:v:162:y:2022:i:c:s136655452200103x
    DOI: 10.1016/j.tre.2022.102712

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S136655452200103X
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.tre.2022.102712?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
