Printed from https://ideas.repec.org/a/eee/transe/v162y2022ics136655452200103x.html

Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities

Author

Listed:
  • Yan, Yimo
  • Chow, Andy H.F.
  • Ho, Chin Pang
  • Kuo, Yong-Hong
  • Wu, Qihao
  • Ying, Chengshuo

Abstract

With advances in technology, data science techniques, and computing equipment, there has been rapidly growing interest in applying reinforcement learning (RL) to the challenges arising from evolving business and organisational operations in logistics and supply chain management (SCM). This paper provides a comprehensive review of the development and applications of RL techniques in logistics and SCM. We first introduce RL methodologies, then classify previous research studies by application. The state-of-the-art research is reviewed and current challenges are discussed. We find that Q-learning (QL) is the most popular RL approach adopted in these studies, and that research on RL for urban logistics has grown in recent years owing to the prevalence of e-commerce and last-mile delivery. Finally, potential directions for future research are presented.
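The abstract identifies Q-learning as the RL approach most often adopted in the surveyed logistics and SCM studies. As a minimal illustration (not taken from the paper itself), the sketch below applies tabular Q-learning to a hypothetical single-item inventory-replenishment MDP: the state is on-hand stock, the action is the order quantity, and the reward penalises ordering, holding, and stockouts. All parameter names and cost values are illustrative assumptions.

```python
import random

# Toy single-item inventory MDP (illustrative assumptions, not from the paper):
# state = on-hand stock (0..MAX_STOCK), action = order quantity, random demand.
MAX_STOCK = 5
HOLDING_COST = 1.0   # per unit carried into the next period
STOCKOUT_COST = 4.0  # per unit of unmet demand
ORDER_COST = 0.5     # per unit ordered


def step(stock, order, rng):
    """Simulate one period; return (next_stock, reward)."""
    demand = rng.randint(0, 3)
    stocked = min(stock + order, MAX_STOCK)
    sold = min(stocked, demand)
    next_stock = stocked - sold
    reward = -(ORDER_COST * order
               + HOLDING_COST * next_stock
               + STOCKOUT_COST * (demand - sold))
    return next_stock, reward


def q_learning(episodes=2000, horizon=20, alpha=0.1, gamma=0.95,
               eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    # Feasible actions in state s are order quantities 0..MAX_STOCK - s.
    q = {(s, a): 0.0
         for s in range(MAX_STOCK + 1)
         for a in range(MAX_STOCK + 1 - s)}
    for _ in range(episodes):
        stock = rng.randint(0, MAX_STOCK)
        for _ in range(horizon):
            actions = list(range(MAX_STOCK + 1 - stock))
            if rng.random() < eps:
                a = rng.choice(actions)            # explore
            else:
                a = max(actions, key=lambda x: q[(stock, x)])  # exploit
            nxt, r = step(stock, a, rng)
            best_next = max(q[(nxt, x)] for x in range(MAX_STOCK + 1 - nxt))
            # Standard QL temporal-difference update.
            q[(stock, a)] += alpha * (r + gamma * best_next - q[(stock, a)])
            stock = nxt
    return q


q = q_learning()
# Greedy policy: recommended order quantity for each stock level.
policy = {s: max(range(MAX_STOCK + 1 - s), key=lambda a: q[(s, a)])
          for s in range(MAX_STOCK + 1)}
```

The learned `policy` maps each stock level to an order quantity; the surveyed papers apply the same update rule to far richer state and action spaces, often with function approximation in place of the lookup table.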

Suggested Citation

  • Yan, Yimo & Chow, Andy H.F. & Ho, Chin Pang & Kuo, Yong-Hong & Wu, Qihao & Ying, Chengshuo, 2022. "Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 162(C).
  • Handle: RePEc:eee:transe:v:162:y:2022:i:c:s136655452200103x
    DOI: 10.1016/j.tre.2022.102712

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S136655452200103X
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.tre.2022.102712?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Yong-Hong Kuo & Andrew Kusiak, 2019. "From data to big data in production research: the past and future trends," International Journal of Production Research, Taylor & Francis Journals, vol. 57(15-16), pages 4828-4853, August.
    2. Lafkihi, Mariam & Pan, Shenle & Ballot, Eric, 2019. "Freight transportation service procurement: A literature review and future research opportunities in omnichannel E-commerce," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 125(C), pages 348-365.
    3. Wang, Xin & Kuo, Yong-Hong & Shen, Houcai & Zhang, Lianmin, 2021. "Target-oriented robust location–transportation problem with service-level measure," Transportation Research Part B: Methodological, Elsevier, vol. 153(C), pages 1-20.
    4. Rana, Rupal & Oliveira, Fernando S., 2014. "Real-time dynamic pricing in a non-stationary environment using model-free reinforcement learning," Omega, Elsevier, vol. 47(C), pages 116-126.
    5. Firdausiyah, N. & Taniguchi, E. & Qureshi, A.G., 2019. "Modeling city logistics using adaptive dynamic programming based multi-agent simulation," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 125(C), pages 74-96.
    6. Amir Ardestani-Jaafari & Erick Delage, 2018. "The Value of Flexibility in Robust Location–Transportation Problems," Transportation Science, INFORMS, vol. 52(1), pages 189-209, January.
    7. Martin, Layla & Minner, Stefan, 2021. "Feature-based selection of carsharing relocation modes," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 149(C).
    8. Dimitris Bertsimas & Aurélie Thiele, 2006. "A Robust Optimization Approach to Inventory Theory," Operations Research, INFORMS, vol. 54(1), pages 150-168, February.
    9. Mitręga, Maciej & Choi, Tsan-Ming, 2021. "How small-and-medium transportation companies handle asymmetric customer relationships under COVID-19 pandemic: A multi-method study," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 148(C).
    10. Shenle Pan & Damien Trentesaux & Duncan Mcfarlane & Benoit Montreuil & Eric Ballot & George Huang, 2021. "Digital interoperability in logistics and supply chain management: state-of-the-art and research avenues towards Physical Internet," Post-Print hal-03161524, HAL.
    11. Cleophas, Catherine & Cottrill, Caitlin & Ehmke, Jan Fabian & Tierney, Kevin, 2019. "Collaborative urban transportation: Recent advances in theory and practice," European Journal of Operational Research, Elsevier, vol. 273(3), pages 801-816.
    12. Mariam Lafkihi & Shenle Pan & Eric Ballot, 2019. "Freight transportation service procurement: A literature review and future research opportunities in omnichannel E-commerce," Post-Print hal-02086154, HAL.
    13. Ying, Cheng-shuo & Chow, Andy H.F. & Chin, Kwai-Sang, 2020. "An actor-critic deep reinforcement learning approach for metro train scheduling with rolling stock circulation under stochastic demand," Transportation Research Part B: Methodological, Elsevier, vol. 140(C), pages 210-235.
    14. Byeongseop Kim & Yongkuk Jeong & Jong Gye Shin, 2020. "Spatial arrangement using deep reinforcement learning to minimise rearrangement in ship block stockyards," International Journal of Production Research, Taylor & Francis Journals, vol. 58(16), pages 5062-5076, July.
    15. Kim, Kap Hwan & Lee, Keung Mo & Hwang, Hark, 2003. "Sequencing delivery and receiving operations for yard cranes in port container terminals," International Journal of Production Economics, Elsevier, vol. 84(3), pages 283-292, June.
    16. Al Hajj Hassan, Lama & Mahmassani, Hani S. & Chen, Ying, 2020. "Reinforcement learning framework for freight demand forecasting to support operational planning decisions," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 137(C).
    17. Fotuhi, Fateme & Huynh, Nathan & Vidal, Jose M. & Xie, Yuanchang, 2013. "Modeling yard crane operators as reinforcement learning agents," Research in Transportation Economics, Elsevier, vol. 42(1), pages 3-12.
    18. Volodymyr Mnih & Koray Kavukcuoglu & David Silver & Andrei A. Rusu & Joel Veness & Marc G. Bellemare & Alex Graves & Martin Riedmiller & Andreas K. Fidjeland & Georg Ostrovski & Stig Petersen & Charle, 2015. "Human-level control through deep reinforcement learning," Nature, Nature, vol. 518(7540), pages 529-533, February.
    19. Shenle Pan & Damien Trentesaux & Duncan Mcfarlane & Benoit Montreuil & Eric Ballot & George Huang, 2021. "Digital interoperability and transformation in logistics and supply chain management: Editorial," Post-Print hal-03195695, HAL.
    20. Enayati, Shakiba & Özaltın, Osman Y., 2020. "Optimal influenza vaccine distribution with equity," European Journal of Operational Research, Elsevier, vol. 283(2), pages 714-725.
    21. Arnab Nilim & Laurent El Ghaoui, 2005. "Robust Control of Markov Decision Processes with Uncertain Transition Matrices," Operations Research, INFORMS, vol. 53(5), pages 780-798, October.
    22. Rameshwar Dubey & Angappa Gunasekaran & Thanos Papadopoulos, 2019. "Disaster relief operations: past, present and future," Annals of Operations Research, Springer, vol. 283(1), pages 1-8, December.
    23. Cheung, Kam-Fung & Bell, Michael G.H. & Bhattacharjya, Jyotirmoyee, 2021. "Cybersecurity in logistics and supply chain management: An overview and future research directions," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 146(C).
    24. Asadi, Amin & Nurre Pinkley, Sarah, 2021. "A stochastic scheduling, allocation, and inventory replenishment problem for battery swap stations," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 146(C).
    25. Li, Xueping & Wang, Jiao & Sawhney, Rapinder, 2012. "Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems," European Journal of Operational Research, Elsevier, vol. 221(1), pages 99-109.
    26. Martin, Simon & Ouelhadj, Djamila & Beullens, Patrick & Ozcan, Ender & Juan, Angel A. & Burke, Edmund K., 2016. "A multi-agent based cooperative approach to scheduling and routing," European Journal of Operational Research, Elsevier, vol. 254(1), pages 169-178.
    27. Wolfram Wiesemann & Daniel Kuhn & Berç Rustem, 2013. "Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 38(1), pages 153-183, February.
    28. Hau L. Lee & V. Padmanabhan & Seungjin Whang, 1997. "Information Distortion in a Supply Chain: The Bullwhip Effect," Management Science, INFORMS, vol. 43(4), pages 546-558, April.
    29. Giannoccaro, Ilaria & Pontrandolfo, Pierpaolo, 2002. "Inventory management in supply chains: a reinforcement learning approach," International Journal of Production Economics, Elsevier, vol. 78(2), pages 153-161, July.
    30. Choi, Tsan-Ming, 2020. "Internet based elastic logistics platforms for fashion quick response systems in the digital era," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 143(C).
    31. Yin, Jiateng & Tang, Tao & Yang, Lixing & Gao, Ziyou & Ran, Bin, 2016. "Energy-efficient metro train rescheduling with uncertain time-variant passenger demands: An approximate dynamic programming approach," Transportation Research Part B: Methodological, Elsevier, vol. 91(C), pages 178-210.
    32. Illhoe Hwang & Young Jae Jang, 2020. "Q(λ) learning-based dynamic route guidance algorithm for overhead hoist transport systems in semiconductor fabs," International Journal of Production Research, Taylor & Francis Journals, vol. 58(4), pages 1199-1221, February.
    33. Chen-Fu Chien & Yun-Siang Lin & Sheng-Kai Lin, 2020. "Deep reinforcement learning for selecting demand forecast models to empower Industry 3.5 and an empirical study for a semiconductor component distributor," International Journal of Production Research, Taylor & Francis Journals, vol. 58(9), pages 2784-2804, May.
    34. Ahamed, Tanvir & Zou, Bo & Farazi, Nahid Parvez & Tulabandhula, Theja, 2021. "Deep Reinforcement Learning for Crowdsourced Urban Delivery," Transportation Research Part B: Methodological, Elsevier, vol. 152(C), pages 227-257.
    35. Bruzzone, Francesco & Cavallaro, Federico & Nocera, Silvio, 2021. "The integration of passenger and freight transport for first-last mile operations," Transport Policy, Elsevier, vol. 100(C), pages 31-48.
    36. Galindo, Gina & Batta, Rajan, 2013. "Review of recent developments in OR/MS research in disaster operations management," European Journal of Operational Research, Elsevier, vol. 230(2), pages 201-211.
    37. Kyuree Ahn & Jinkyoo Park, 2021. "Cooperative zone-based rebalancing of idle overhead hoist transportations using multi-agent reinforcement learning with graph representation learning," IISE Transactions, Taylor & Francis Journals, vol. 53(10), pages 1140-1156, October.
    38. Nie, Yu (Marco) & Wu, Xing, 2009. "Shortest path problem considering on-time arrival probability," Transportation Research Part B: Methodological, Elsevier, vol. 43(6), pages 597-613, July.
    39. Liu, Shan & Jiang, Hai & Chen, Shuiping & Ye, Jing & He, Renqing & Sun, Zhizhao, 2020. "Integrating Dijkstra’s algorithm into deep inverse reinforcement learning for food delivery route planning," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 142(C).
    40. Chiang, Chi, 2003. "Optimal replenishment for a periodic review inventory system with two supply modes," European Journal of Operational Research, Elsevier, vol. 149(1), pages 229-244, August.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.


    Cited by:

    1. Ding, Yida & Wandelt, Sebastian & Wu, Guohua & Xu, Yifan & Sun, Xiaoqian, 2023. "Towards efficient airline disruption recovery with reinforcement learning," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 179(C).
    2. Guo, Feng & Wei, Qu & Wang, Miao & Guo, Zhaoxia & Wallace, Stein W., 2023. "Deep attention models with dimension-reduction and gate mechanisms for solving practical time-dependent vehicle routing problems," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 173(C).
    3. Amine Masmoudi, M. & Mancini, Simona & Baldacci, Roberto & Kuo, Yong-Hong, 2022. "Vehicle routing problems with drones equipped with multi-package payload compartments," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 164(C).
    4. Wadi Khalid Anuar & Lai Soon Lee & Hsin-Vonn Seow & Stefan Pickl, 2022. "A Multi-Depot Dynamic Vehicle Routing Problem with Stochastic Road Capacity: An MDP Model and Dynamic Policy for Post-Decision State Rollout Algorithm in Reinforcement Learning," Mathematics, MDPI, vol. 10(15), pages 1-70, July.
    5. Wang, Haibo & Alidaee, Bahram, 2023. "White-glove service delivery: A quantitative analysis," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 175(C).
    6. Li, Huanhuan & Jiao, Hang & Yang, Zaili, 2023. "AIS data-driven ship trajectory prediction modelling and analysis based on machine learning and deep learning methods," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 175(C).
    7. Fang, Chao & Han, Zonglei & Wang, Wei & Zio, Enrico, 2023. "Routing UAVs in landslides Monitoring: A neural network heuristic for team orienteering with mandatory visits," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 175(C).
    8. Kuo, Yong-Hong & Leung, Janny M.Y. & Yan, Yimo, 2023. "Public transport for smart cities: Recent innovations and future challenges," European Journal of Operational Research, Elsevier, vol. 306(3), pages 1001-1026.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiaoyan Xu & Suresh P. Sethi & Sai‐Ho Chung & Tsan‐Ming Choi, 2023. "Reforming global supply chain management under pandemics: The GREAT‐3Rs framework," Production and Operations Management, Production and Operations Management Society, vol. 32(2), pages 524-546, February.
    2. Kong, Xiang T.R. & Kang, Kai & Zhong, Ray Y. & Luo, Hao & Xu, Su Xiu, 2021. "Cyber physical system-enabled on-demand logistics trading," International Journal of Production Economics, Elsevier, vol. 233(C).
    3. Wang, Xuekai & D’Ariano, Andrea & Su, Shuai & Tang, Tao, 2023. "Cooperative train control during the power supply shortage in metro system: A multi-agent reinforcement learning approach," Transportation Research Part B: Methodological, Elsevier, vol. 170(C), pages 244-278.
    4. Qi, Mingyao & Yang, Ying & Cheng, Chun, 2023. "Location and inventory pre-positioning problem under uncertainty," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 177(C).
    5. Arthur Flajolet & Sébastien Blandin & Patrick Jaillet, 2018. "Robust Adaptive Routing Under Uncertainty," Operations Research, INFORMS, vol. 66(1), pages 210-229, January.
    6. Li, Wenqing & Ni, Shaoquan, 2022. "Train timetabling with the general learning environment and multi-agent deep reinforcement learning," Transportation Research Part B: Methodological, Elsevier, vol. 157(C), pages 230-251.
    7. Xin, Linwei & Goldberg, David A., 2021. "Time (in)consistency of multistage distributionally robust inventory models with moment constraints," European Journal of Operational Research, Elsevier, vol. 289(3), pages 1127-1141.
    8. Wang, Haibo & Alidaee, Bahram, 2023. "White-glove service delivery: A quantitative analysis," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 175(C).
    9. Guo, Chaojie & Thompson, Russell G. & Foliente, Greg & Kong, Xiang T.R., 2021. "An auction-enabled collaborative routing mechanism for omnichannel on-demand logistics through transshipment," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 146(C).
    10. Martin W.P Savelsbergh & Marlin W. Ulmer, 2022. "Challenges and opportunities in crowdsourced delivery planning and operations," 4OR, Springer, vol. 20(1), pages 1-21, March.
    11. Parvez Farazi, Nahid & Zou, Bo & Tulabandhula, Theja, 2022. "Dynamic On-Demand Crowdshipping Using Constrained and Heuristics-Embedded Double Dueling Deep Q-Network," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 166(C).
    12. Maximilian Blesch & Philipp Eisenhauer, 2021. "Robust decision-making under risk and ambiguity," Papers 2104.12573, arXiv.org, revised Oct 2021.
    13. Dellbrügge, Marius & Brilka, Tim & Kreuz, Felix & Clausen, Uwe, 2022. "Auction design in strategic freight procurement," Chapters from the Proceedings of the Hamburg International Conference of Logistics (HICL), in: Kersten, Wolfgang & Jahn, Carlos & Blecker, Thorsten & Ringle, Christian M. (ed.), Changing Tides: The New Role of Resilience and Sustainability in Logistics and Supply Chain Management – Innovative Approaches for the Shift to a New , volume 33, pages 295-325, Hamburg University of Technology (TUHH), Institute of Business Logistics and General Management.
    14. Boute, Robert N. & Gijsbrechts, Joren & van Jaarsveld, Willem & Vanvuchelen, Nathalie, 2022. "Deep reinforcement learning for inventory control: A roadmap," European Journal of Operational Research, Elsevier, vol. 298(2), pages 401-412.
    15. Fink, Alexander A. & Klöckner, Maximilian & Räder, Tobias & Wagner, Stephan M., 2022. "Supply chain management accelerators: Types, objectives, and key design features," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 164(C).
    16. Li, Zhaojin & Liu, Ya & Yang, Zhen, 2021. "An effective kernel search and dynamic programming hybrid heuristic for a multimodal transportation planning problem with order consolidation," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 152(C).
    17. Fernando Ordóñez & Nicolás E. Stier-Moses, 2010. "Wardrop Equilibria with Risk-Averse Users," Transportation Science, INFORMS, vol. 44(1), pages 63-86, February.
    18. Fotuhi, Fateme & Huynh, Nathan & Vidal, Jose M. & Xie, Yuanchang, 2013. "Modeling yard crane operators as reinforcement learning agents," Research in Transportation Economics, Elsevier, vol. 42(1), pages 3-12.
    19. Aliakbari Sani, Sajad & Bahn, Olivier & Delage, Erick, 2022. "Affine decision rule approximation to address demand response uncertainty in smart Grids’ capacity planning," European Journal of Operational Research, Elsevier, vol. 303(1), pages 438-455.
    20. Roberto Gomes de Mattos & Fabricio Oliveira & Adriana Leiras & Abdon Baptista de Paula Filho & Paulo Gonçalves, 2019. "Robust optimization of the insecticide-treated bed nets procurement and distribution planning under uncertainty for malaria prevention and control," Annals of Operations Research, Springer, vol. 283(1), pages 1045-1078, December.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:transe:v:162:y:2022:i:c:s136655452200103x. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/wps/find/journaldescription.cws_home/600244/description#description .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.