Printed from https://ideas.repec.org/a/eee/ejores/v309y2023i1p446-468.html

A general deep reinforcement learning hyperheuristic framework for solving combinatorial optimization problems

Author

Listed:
  • Kallestad, Jakob
  • Hasibi, Ramin
  • Hemmati, Ahmad
  • Sörensen, Kenneth

Abstract

Many problem-specific heuristic frameworks have been developed to solve combinatorial optimization problems, but these frameworks do not generalize well to other problem domains. Metaheuristic frameworks aim to be more generalizable than traditional heuristics; however, their performance suffers from poor selection of low-level heuristics (operators) during the search process. An example of heuristic selection in a metaheuristic framework is the adaptive layer of the popular Adaptive Large Neighborhood Search (ALNS) framework. Here, we propose a selection hyperheuristic framework that uses Deep Reinforcement Learning (Deep RL) as an alternative to the adaptive layer of ALNS. Unlike the adaptive layer, which considers only the heuristics’ past performance when making future selections, a Deep RL agent can take into account additional information from the search process, e.g., the difference in objective value between iterations, to make better decisions. This is due to the representation power of Deep Learning methods and the decision-making capability of the Deep RL agent, which can learn to adapt to different problems and instance characteristics. In this paper, by integrating the Deep RL agent into the ALNS framework, we introduce the Deep Reinforcement Learning Hyperheuristic (DRLH), a general framework for solving a wide variety of combinatorial optimization problems, and show that our framework is better at selecting low-level heuristics at each step of the search process than ALNS and Uniform Random Selection (URS). Our experiments also show that while ALNS cannot properly handle a large pool of heuristics, DRLH is not negatively affected by increasing the number of heuristics.
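
To make the contrast in the abstract concrete, the following is a minimal sketch of the two baseline selection schemes it mentions: an ALNS-style adaptive layer (roulette-wheel selection with segment-wise weight updates, as the adaptive layer is commonly described in the ALNS literature) and Uniform Random Selection. It is not the authors' code. The class name `AdaptiveLayer`, the `reaction` parameter, the initial weights of 1.0, and the scoring convention are illustrative assumptions. The sketch shows that the adaptive layer sees only accumulated scores per heuristic, whereas a Deep RL agent would replace `select` with a learned policy over richer search-state features.

```python
import random


class AdaptiveLayer:
    """ALNS-style operator selection (illustrative sketch, not the paper's code).

    Each low-level heuristic i has a weight w_i and is chosen with
    probability w_i / sum(w). Weights depend only on past scores.
    """

    def __init__(self, n_operators, reaction=0.1, seed=None):
        # Assumption: all heuristics start with equal weight 1.0.
        self.weights = [1.0] * n_operators
        self.scores = [0.0] * n_operators
        self.uses = [0] * n_operators
        # Assumption: 0 < reaction < 1, so weights stay strictly positive.
        self.reaction = reaction
        self.rng = random.Random(seed)

    def select(self):
        """Roulette-wheel selection proportional to the current weights."""
        idx = self.rng.choices(range(len(self.weights)), weights=self.weights)[0]
        self.uses[idx] += 1
        return idx

    def reward(self, idx, score):
        """Accumulate a non-negative score for heuristic idx in this segment."""
        self.scores[idx] += score

    def end_segment(self):
        """Blend each used heuristic's weight with its average segment score."""
        r = self.reaction
        for i, used in enumerate(self.uses):
            if used > 0:
                avg = self.scores[i] / used
                self.weights[i] = (1 - r) * self.weights[i] + r * avg
        # Reset the per-segment statistics.
        self.scores = [0.0] * len(self.weights)
        self.uses = [0] * len(self.weights)


def uniform_random_selection(n_operators, rng):
    """URS baseline: every heuristic is equally likely at every step."""
    return rng.randrange(n_operators)
```

In a search loop, one would call `select()` each iteration, apply the chosen heuristic, call `reward()` with a score reflecting the outcome (for example, higher for a new best solution), and call `end_segment()` after a fixed number of iterations.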

Suggested Citation

  • Kallestad, Jakob & Hasibi, Ramin & Hemmati, Ahmad & Sörensen, Kenneth, 2023. "A general deep reinforcement learning hyperheuristic framework for solving combinatorial optimization problems," European Journal of Operational Research, Elsevier, vol. 309(1), pages 446-468.
  • Handle: RePEc:eee:ejores:v:309:y:2023:i:1:p:446-468
    DOI: 10.1016/j.ejor.2023.01.017

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S037722172300036X
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Grangier, Philippe & Gendreau, Michel & Lehuédé, Fabien & Rousseau, Louis-Martin, 2016. "An adaptive large neighborhood search for the two-echelon multiple-trip vehicle routing problem with satellite synchronization," European Journal of Operational Research, Elsevier, vol. 254(1), pages 80-91.
    2. Stefan Ropke & David Pisinger, 2006. "An Adaptive Large Neighborhood Search Heuristic for the Pickup and Delivery Problem with Time Windows," Transportation Science, INFORMS, vol. 40(4), pages 455-472, November.
    3. Ender Özcan & Mustafa Misir & Gabriela Ochoa & Edmund K. Burke, 2010. "A Reinforcement Learning - Great-Deluge Hyper-Heuristic for Examination Timetabling," International Journal of Applied Metaheuristic Computing (IJAMC), IGI Global, vol. 1(1), pages 39-59, January.
    4. Crama, Y. & Schyns, M., 2003. "Simulated annealing for complex portfolio selection problems," European Journal of Operational Research, Elsevier, vol. 150(3), pages 546-571, November.
    5. David Pisinger & Stefan Ropke, 2019. "Large Neighborhood Search," International Series in Operations Research & Management Science, in: Michel Gendreau & Jean-Yves Potvin (ed.), Handbook of Metaheuristics, edition 3, chapter 0, pages 99-127, Springer.
    6. López-Ibáñez, Manuel & Dubois-Lacoste, Jérémie & Pérez Cáceres, Leslie & Birattari, Mauro & Stützle, Thomas, 2016. "The irace package: Iterated racing for automatic algorithm configuration," Operations Research Perspectives, Elsevier, vol. 3(C), pages 43-58.
    7. Homsi, Gabriel & Martinelli, Rafael & Vidal, Thibaut & Fagerholt, Kjetil, 2020. "Industrial and tramp ship routing problems: Closing the gap for real-scale instances," European Journal of Operational Research, Elsevier, vol. 283(3), pages 972-990.
    8. Edmund K. Burke & Matthew Hyde & Graham Kendall & Gabriela Ochoa & Ender Özcan & John R. Woodward, 2010. "A Classification of Hyper-heuristic Approaches," International Series in Operations Research & Management Science, in: Michel Gendreau & Jean-Yves Potvin (ed.), Handbook of Metaheuristics, chapter 0, pages 449-468, Springer.
    9. Gullhav, Anders N. & Cordeau, Jean-François & Hvattum, Lars Magnus & Nygreen, Bjørn, 2017. "Adaptive large neighborhood search heuristics for multi-tier service deployment problems in clouds," European Journal of Operational Research, Elsevier, vol. 259(3), pages 829-846.
    10. Demir, Emrah & Bektaş, Tolga & Laporte, Gilbert, 2012. "An adaptive large neighborhood search heuristic for the Pollution-Routing Problem," European Journal of Operational Research, Elsevier, vol. 223(2), pages 346-359.
    11. Li, Yuan & Chen, Haoxun & Prins, Christian, 2016. "Adaptive large neighborhood search for the pickup and delivery problem with time windows, profits, and reserved requests," European Journal of Operational Research, Elsevier, vol. 252(1), pages 27-38.
    12. Turkeš, Renata & Sörensen, Kenneth & Hvattum, Lars Magnus, 2021. "Meta-analysis of metaheuristics: Quantifying the effect of adaptiveness in adaptive large neighborhood search," European Journal of Operational Research, Elsevier, vol. 292(2), pages 423-442.
    13. Aksen, Deniz & Kaya, Onur & Sibel Salman, F. & Tüncel, Özge, 2014. "An adaptive large neighborhood search algorithm for a selective and periodic inventory routing problem," European Journal of Operational Research, Elsevier, vol. 239(2), pages 413-426.
    14. Chen, Cheng & Demir, Emrah & Huang, Yuan, 2021. "An adaptive large neighborhood search heuristic for the vehicle routing problem with time windows and delivery robots," European Journal of Operational Research, Elsevier, vol. 294(3), pages 1164-1180.

    Citations

    Cited by:

    1. Yen, Benjamin P.-C. & Luo, Yu, 2023. "Navigational guidance – A deep learning approach," European Journal of Operational Research, Elsevier, vol. 310(3), pages 1179-1191.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Turkeš, Renata & Sörensen, Kenneth & Hvattum, Lars Magnus, 2021. "Meta-analysis of metaheuristics: Quantifying the effect of adaptiveness in adaptive large neighborhood search," European Journal of Operational Research, Elsevier, vol. 292(2), pages 423-442.
    2. TURKEŠ, Renata & SÖRENSEN, Kenneth & HVATTUM, Lars Magnus & BARRENA, Eva & CHENTLI, Hayet & COELHO, Leandro & DAYARIAN, Iman & GRIMAULT, Axel & GULLHAVE, Anders & IRIS, Çagatay & KESKIN, Merve & KIEFE, 2019. "Meta-analysis of metaheuristics: Quantifying the effect of adaptiveness in adaptive large neighborhood search," Working Papers 2019002, University of Antwerp, Faculty of Business and Economics.
    3. Singh, Nitish & Dang, Quang-Vinh & Akcay, Alp & Adan, Ivo & Martagan, Tugce, 2022. "A matheuristic for AGV scheduling with battery constraints," European Journal of Operational Research, Elsevier, vol. 298(3), pages 855-873.
    4. Yu, Vincent F. & Anh, Pham Tuan & Baldacci, Roberto, 2023. "A robust optimization approach for the vehicle routing problem with cross-docking under demand uncertainty," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 173(C).
    5. Huang, Baobin & Tang, Lixin & Baldacci, Roberto & Wang, Gongshu & Sun, Defeng, 2023. "A metaheuristic algorithm for a locomotive routing problem arising in the steel industry," European Journal of Operational Research, Elsevier, vol. 308(1), pages 385-399.
    6. Dumez, Dorian & Tilk, Christian & Irnich, Stefan & Lehuédé, Fabien & Olkis, Katharina & Péton, Olivier, 2023. "A matheuristic for a 2-echelon vehicle routing problem with capacitated satellites and reverse flows," European Journal of Operational Research, Elsevier, vol. 305(1), pages 64-84.
    7. Zhang, Yimeng & Li, Xinlei & van Hassel, Edwin & Negenborn, Rudy R. & Atasoy, Bilge, 2022. "Synchromodal transport planning considering heterogeneous and vague preferences of shippers," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 164(C).
    8. Dumez, Dorian & Lehuédé, Fabien & Péton, Olivier, 2021. "A large neighborhood search approach to the vehicle routing problem with delivery options," Transportation Research Part B: Methodological, Elsevier, vol. 144(C), pages 103-132.
    9. Li, Hongqi & Wang, Haotian & Chen, Jun & Bai, Ming, 2020. "Two-echelon vehicle routing problem with time windows and mobile satellites," Transportation Research Part B: Methodological, Elsevier, vol. 138(C), pages 179-201.
    10. Frey, Christian M.M. & Jungwirth, Alexander & Frey, Markus & Kolisch, Rainer, 2023. "The vehicle routing problem with time windows and flexible delivery locations," European Journal of Operational Research, Elsevier, vol. 308(3), pages 1142-1159.
    11. Alberto Santini & Stefan Ropke & Lars Magnus Hvattum, 2018. "A comparison of acceptance criteria for the adaptive large neighbourhood search metaheuristic," Journal of Heuristics, Springer, vol. 24(5), pages 783-815, October.
    12. Arjun Paul & Ravi Shankar Kumar & Chayanika Rout & Adrijit Goswami, 2021. "A bi-objective two-echelon pollution routing problem with simultaneous pickup and delivery under multiple time windows constraint," OPSEARCH, Springer;Operational Research Society of India, vol. 58(4), pages 962-993, December.
    13. Hammami, Farouk & Rekik, Monia & Coelho, Leandro C., 2019. "Exact and heuristic solution approaches for the bid construction problem in transportation procurement auctions with a heterogeneous fleet," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 127(C), pages 150-177.
    14. Liu, Yiming & Roberto, Baldacci & Zhou, Jianwen & Yu, Yang & Zhang, Yu & Sun, Wei, 2023. "Efficient feasibility checks and an adaptive large neighborhood search algorithm for the time-dependent green vehicle routing problem with time windows," European Journal of Operational Research, Elsevier, vol. 310(1), pages 133-155.
    15. Dayarian, Iman & Crainic, Teodor Gabriel & Gendreau, Michel & Rei, Walter, 2016. "An adaptive large-neighborhood search heuristic for a multi-period vehicle routing problem," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 95(C), pages 95-123.
    16. Franceschetti, Anna & Demir, Emrah & Honhon, Dorothée & Van Woensel, Tom & Laporte, Gilbert & Stobbe, Mark, 2017. "A metaheuristic for the time-dependent pollution-routing problem," European Journal of Operational Research, Elsevier, vol. 259(3), pages 972-991.
    17. Li, Hongqi & Wang, Haotian & Chen, Jun & Bai, Ming, 2021. "Two-echelon vehicle routing problem with satellite bi-synchronization," European Journal of Operational Research, Elsevier, vol. 288(3), pages 775-793.
    18. Yu, Shaohua & Puchinger, Jakob & Sun, Shudong, 2022. "Van-based robot hybrid pickup and delivery routing problem," European Journal of Operational Research, Elsevier, vol. 298(3), pages 894-914.
    19. Chen, Cheng & Demir, Emrah & Huang, Yuan, 2021. "An adaptive large neighborhood search heuristic for the vehicle routing problem with time windows and delivery robots," European Journal of Operational Research, Elsevier, vol. 294(3), pages 1164-1180.
    20. Yin, Jiateng & D’Ariano, Andrea & Wang, Yihui & Yang, Lixing & Tang, Tao, 2021. "Timetable coordination in a rail transit network with time-dependent passenger demand," European Journal of Operational Research, Elsevier, vol. 295(1), pages 183-202.
