Author
Listed:
- Antoni Guerrero
(Production Management and Engineering Research Centre, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain
Baobab Soluciones, 55 Jose Abascal, 28003 Madrid, Spain)
- Marc Escoto
(Production Management and Engineering Research Centre, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain)
- Majsa Ammouriova
(School of Applied Technical Sciences, German Jordanian University, Amman 11180, Jordan
Computer Science Department, Universitat Oberta de Catalunya, 156 Rambla Poblenou, 08018 Barcelona, Spain)
- Yangchongyi Men
(Production Management and Engineering Research Centre, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain)
- Angel A. Juan
(Production Management and Engineering Research Centre, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain
Euncet Business School, Universitat Politècnica de Catalunya, 1 Cami Mas Rubial, 08225 Terrassa, Spain)
Abstract
This paper presents a reinforcement learning (RL) approach for solving the team orienteering problem under both deterministic and dynamic travel time conditions. The proposed method builds on the transformer architecture and is trained to construct routes that adapt to real-time variations, such as traffic and environmental changes. A key contribution of this work is the model’s ability to generalize across problem instances with varying numbers of nodes and vehicles, eliminating the need for retraining when problem size changes. To assess performance, a comprehensive set of experiments involving 27,000 synthetic instances is conducted, comparing the RL model with a variable neighborhood search metaheuristic. The results indicate that the RL model achieves competitive solution quality while requiring significantly less computational time. Moreover, the RL approach consistently produces feasible solutions across all dynamic instances, demonstrating strong robustness in meeting time constraints. These findings suggest that learning-based methods can offer efficient, scalable, and adaptable solutions for routing problems in dynamic and uncertain environments.
Suggested Citation
Antoni Guerrero & Marc Escoto & Majsa Ammouriova & Yangchongyi Men & Angel A. Juan, 2025.
"Using Transformers and Reinforcement Learning for the Team Orienteering Problem Under Dynamic Conditions,"
Mathematics, MDPI, vol. 13(14), pages 1-19, July.
Handle:
RePEc:gam:jmathe:v:13:y:2025:i:14:p:2313-:d:1705769