
On-Line Case-Based Policy Learning for Automated Planning in Probabilistic Environments

Author

Listed:
  • Moisés Martínez

    (Faculty of Natural and Mathematical Sciences, King’s College of London, Strand Campus, Bush House, 30 Aldwych, London, WC2B 4BG, United Kingdom)

  • Javier García

    (Computer Science Department, Universidad Carlos III de Madrid, Avenida de la Universidad, 30, Leganés 28911, Madrid, Spain)

  • Fernando Fernández

    (Computer Science Department, Universidad Carlos III de Madrid, Avenida de la Universidad, 30, Leganés 28911, Madrid, Spain)

Abstract

Many robotic control architectures perform a continuous cycle of sensing, reasoning and acting, where the reasoning can be carried out in a reactive or deliberative form. Reactive methods are fast and provide the robot with high interaction and response capabilities. Deliberative reasoning is particularly suitable in robotic systems because it employs some form of forward projection (reasoning in depth about goals, pre-conditions, resources and timing constraints) and provides the robot with reasonable responses in situations unforeseen by the designer. However, this reasoning, typically conducted using Artificial Intelligence techniques like Automated Planning (AP), is not effective for controlling autonomous agents that operate in complex and dynamic environments. Deliberative planning, although feasible in stable situations, takes too long in unexpected or changing situations that require re-planning. Therefore, planning cannot be done on-line in many complex robotic problems, where quick responses are frequently required. In this paper, we propose an alternative approach based on case-based policy learning, which combines deliberative reasoning through AP with reactive response times through reactive planning policies. The method is based on learning planning knowledge from actual experiences to obtain a case-based policy. The contribution of this paper is twofold. First, it is shown that the learned case-based policy produces reasonable and timely responses in complex environments. Second, it is also shown how one case-based policy that solves a particular problem can be reused to solve a similar but more complex problem in a transfer learning setting.
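
The general idea of a case-based policy described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given on this page. It assumes that a state is a set of ground facts, that similarity is the Jaccard index, and that a hypothetical `planner` callable stands in for the deliberative AP component. All names (`CaseBasedPolicy`, `toy_planner`, `threshold`) are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Optional, Tuple

# Assumption: a state is represented as a frozen set of ground facts.
State = FrozenSet[str]


def jaccard(a: State, b: State) -> float:
    """Jaccard similarity between two fact sets (an assumed similarity measure)."""
    union = a | b
    return 1.0 if not union else len(a & b) / len(union)


@dataclass
class CaseBasedPolicy:
    """Reuse stored (state, action) cases; call the slow planner only on a miss."""

    planner: Callable[[State], str]  # hypothetical stand-in for deliberative planning
    threshold: float = 0.8           # assumed minimum similarity to reuse a case
    cases: List[Tuple[State, str]] = field(default_factory=list)
    planner_calls: int = 0

    def retrieve(self, state: State) -> Optional[str]:
        """Return the action of the most similar stored case, or None if too dissimilar."""
        best_action: Optional[str] = None
        best_sim = -1.0
        for stored_state, action in self.cases:
            sim = jaccard(state, stored_state)
            if sim > best_sim:
                best_sim, best_action = sim, action
        return best_action if best_sim >= self.threshold else None

    def act(self, state: State) -> str:
        """Reactive path: reuse a case. Deliberative path: plan, then store the new case."""
        action = self.retrieve(state)
        if action is None:
            action = self.planner(state)
            self.planner_calls += 1
            self.cases.append((state, action))
        return action


def toy_planner(state: State) -> str:
    """Toy deliberative component used only for this illustration."""
    return "pick-up" if "holding-nothing" in state else "put-down"


policy = CaseBasedPolicy(planner=toy_planner)
first = policy.act(frozenset({"holding-nothing", "at-a"}))   # planner is called
second = policy.act(frozenset({"holding-nothing", "at-a"}))  # stored case is reused
```

In this sketch the planner is consulted only when no stored case is similar enough, so repeated or near-identical states are answered from the case base without re-planning.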

Suggested Citation

  • Moisés Martínez & Javier García & Fernando Fernández, 2018. "On-Line Case-Based Policy Learning for Automated Planning in Probabilistic Environments," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 17(03), pages 763-800, May.
  • Handle: RePEc:wsi:ijitdm:v:17:y:2018:i:03:n:s0219622018500086
    DOI: 10.1142/S0219622018500086

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0219622018500086
    Download Restriction: Access to full text is restricted to subscribers



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. SteadieSeifi, M. & Dellaert, N.P. & Nuijten, W. & Van Woensel, T., 2017. "A metaheuristic for the multimodal network flow problem with product quality preservation and empty repositioning," Transportation Research Part B: Methodological, Elsevier, vol. 106(C), pages 321-344.
    2. Subrata Mitra & Balram Avittathur, 2018. "Application of linear programming in optimizing the procurement and movement of coal for an Indian coal-fired power-generating company," DECISION: Official Journal of the Indian Institute of Management Calcutta, Springer;Indian Institute of Management Calcutta, vol. 45(3), pages 207-224, September.
    3. Fabio Vitor & Todd Easton, 2018. "The double pivot simplex method," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 87(1), pages 109-137, February.
    4. Baykasoğlu, Adil & Subulan, Kemal, 2016. "A multi-objective sustainable load planning model for intermodal transportation networks with a real-life application," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 95(C), pages 207-247.
    5. Elías Escobar-Gómez & J.L. Camas-Anzueto & Sabino Velázquez-Trujillo & Héctor Hernández-de-León & Rubén Grajales-Coutiño & Eduardo Chandomí-Castellanos & Héctor Guerra-Crespo, 2019. "A Linear Programming Model with Fuzzy Arc for Route Optimization in the Urban Road Network," Sustainability, MDPI, vol. 11(23), pages 1-18, November.
