Printed from https://ideas.repec.org/a/spr/comgts/v21y2024i1d10.1007_s10287-023-00500-z.html

Implicitly normalized forecaster with clipping for linear and non-linear heavy-tailed multi-armed bandits

Author

Listed:
  • Yuriy Dorn

    (MSU Institute for Artificial Intelligence
    Moscow Institute of Physics and Technology
    Institute for Information Transmission Problems)

  • Nikita Kornilov

    (Moscow Institute of Physics and Technology)

  • Nikolay Kutuzov

    (Moscow Institute of Physics and Technology)

  • Alexander Nazin

    (V.A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences)

  • Eduard Gorbunov

    (Mohamed bin Zayed University of Artificial Intelligence)

  • Alexander Gasnikov

    (Moscow Institute of Physics and Technology
    Skoltech
    ISP RAS Research Center for Trusted Artificial Intelligence)

Abstract

The Implicitly Normalized Forecaster (INF) algorithm is considered to be an optimal solution for adversarial multi-armed bandit (MAB) problems. However, most of the existing complexity results for INF rely on restrictive assumptions, such as bounded rewards. Recently, a related algorithm was proposed that works for both adversarial and stochastic heavy-tailed MAB settings, but it fails to fully exploit the available data. In this paper, we propose a new version of INF called the Implicitly Normalized Forecaster with clipping (INF-clip) for MAB problems with heavy-tailed reward distributions. We establish convergence results under mild assumptions on the reward distribution and demonstrate that INF-clip is optimal for linear heavy-tailed stochastic MAB problems and works well for non-linear ones. Furthermore, we show that INF-clip outperforms the best-of-both-worlds algorithm in cases where it is difficult to distinguish between different arms.

Suggested Citation

  • Yuriy Dorn & Nikita Kornilov & Nikolay Kutuzov & Alexander Nazin & Eduard Gorbunov & Alexander Gasnikov, 2024. "Implicitly normalized forecaster with clipping for linear and non-linear heavy-tailed multi-armed bandits," Computational Management Science, Springer, vol. 21(1), pages 1-29, June.
  • Handle: RePEc:spr:comgts:v:21:y:2024:i:1:d:10.1007_s10287-023-00500-z
    DOI: 10.1007/s10287-023-00500-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10287-023-00500-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

