Printed from https://ideas.repec.org/a/hin/jnlmpe/561026.html

A Sarsa(λ) Algorithm Based on Double-Layer Fuzzy Reasoning

Author

Listed:
  • Quan Liu
  • Xiang Mu
  • Wei Huang
  • Qiming Fu
  • Yonggang Zhang

Abstract

Solving reinforcement learning problems in continuous spaces with function approximation is currently a research hotspot in machine learning. When dealing with continuous-space problems, the classic Q-iteration algorithms based on lookup tables or function approximation converge slowly and make it difficult to derive a continuous policy. To overcome these weaknesses, we propose an algorithm named DFR-Sarsa(λ), based on double-layer fuzzy reasoning, and prove its convergence. In this algorithm, the first reasoning layer uses fuzzy sets of the state to compute continuous actions; the second reasoning layer uses fuzzy sets of the action to compute the components of the Q-value. These two fuzzy layers are then combined to compute the Q-value function over the continuous action space. In addition, the algorithm uses the membership degrees of the activated rules in the two fuzzy reasoning layers to update the eligibility traces. Experimental results from applying DFR-Sarsa(λ) to the Mountain Car and Cart-pole Balancing problems show that the algorithm not only yields a continuous action policy but also achieves better convergence performance.
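
The two-layer structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the triangular membership functions, the one-dimensional state, the greedy choice of one candidate action per state rule, and every name below (`tri_memberships`, `DoubleLayerFuzzyQ`, `state_centers`, `action_centers`) are assumptions made for illustration. The sketch blends per-rule candidate actions into a continuous action (first layer), forms the Q-value from products of state and action membership degrees (second layer), and uses those same products to drive the eligibility traces in a Sarsa(λ) update.

```python
def tri_memberships(x, centers):
    """Normalized triangular membership degrees of x over sorted centers (assumed shape)."""
    x = min(max(x, centers[0]), centers[-1])  # clamp into the covered range
    degrees = []
    for k, c in enumerate(centers):
        if x == c:
            d = 1.0
        elif x < c:  # clamping guarantees a left neighbour exists here
            left = centers[k - 1]
            d = max(0.0, (x - left) / (c - left))
        else:        # clamping guarantees a right neighbour exists here
            right = centers[k + 1]
            d = max(0.0, (right - x) / (right - c))
        degrees.append(d)
    total = sum(degrees)
    return [d / total for d in degrees]


class DoubleLayerFuzzyQ:
    """Hypothetical two-layer fuzzy Q-function with Sarsa(lambda) traces."""

    def __init__(self, state_centers, action_centers, alpha=0.1, gamma=0.99, lam=0.9):
        self.sc, self.ac = state_centers, action_centers
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        # q[i][j]: Q-value component for state rule i and action fuzzy set j
        self.q = [[0.0] * len(action_centers) for _ in state_centers]
        self.e = [[0.0] * len(action_centers) for _ in state_centers]

    def action(self, s):
        """Layer 1: blend each state rule's greedy candidate action by membership."""
        phi = tri_memberships(s, self.sc)
        n = len(self.ac)
        return sum(p * self.ac[max(range(n), key=lambda j: row[j])]
                   for p, row in zip(phi, self.q))

    def features(self, s, a):
        """Products of state and action membership degrees for every rule pair."""
        phi = tri_memberships(s, self.sc)
        psi = tri_memberships(a, self.ac)
        return [[p * m for m in psi] for p in phi]

    def value(self, s, a):
        """Layer 2 combined with layer 1: Q(s, a) as a membership-weighted sum."""
        f = self.features(s, a)
        return sum(f[i][j] * self.q[i][j]
                   for i in range(len(self.sc)) for j in range(len(self.ac)))

    def update(self, s, a, r, s_next, a_next, done=False):
        """One Sarsa(lambda) step; traces accumulate the membership products."""
        f = self.features(s, a)
        target = r if done else r + self.gamma * self.value(s_next, a_next)
        delta = target - self.value(s, a)
        for i in range(len(self.sc)):
            for j in range(len(self.ac)):
                self.e[i][j] = self.gamma * self.lam * self.e[i][j] + f[i][j]
                self.q[i][j] += self.alpha * delta * self.e[i][j]
        return delta
```

In this sketch, a state lying between two rule centers receives an action interpolated between the two rules' candidate actions, which is one way a continuous policy can arise from a finite rule base. An exploration scheme, trace resetting at episode boundaries, and multi-dimensional states are all left out.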

Suggested Citation

  • Quan Liu & Xiang Mu & Wei Huang & Qiming Fu & Yonggang Zhang, 2013. "A Sarsa(λ) Algorithm Based on Double-Layer Fuzzy Reasoning," Mathematical Problems in Engineering, Hindawi, vol. 2013, pages 1-9, December.
  • Handle: RePEc:hin:jnlmpe:561026
    DOI: 10.1155/2013/561026

    Download full text from publisher

    File URL: http://downloads.hindawi.com/journals/MPE/2013/561026.pdf
    Download Restriction: no

    File URL: http://downloads.hindawi.com/journals/MPE/2013/561026.xml
    Download Restriction: no

    File URL: https://libkey.io/10.1155/2013/561026?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item