IDEAS home Printed from https://ideas.repec.org/a/inm/ormoor/v47y2022i3p2186-2218.html
   My bibliography  Save this article

Small-Loss Bounds for Online Learning with Partial Information

Author

Listed:
  • Thodoris Lykouris

    (Massachusetts Institute of Technology, Cambridge, Massachusetts 02139)

  • Karthik Sridharan

    (Cornell University, Ithaca, New York 14850)

  • Éva Tardos

    (Cornell University, Ithaca, New York 14850)

Abstract

We consider the problem of adversarial (nonstochastic) online learning with partial-information feedback, in which, at each round, a decision maker selects an action from a finite set of alternatives. We develop a black-box approach for such problems in which the learner observes as feedback only losses of a subset of the actions that includes the selected action. When losses of actions are nonnegative, under the graph-based feedback model introduced by Mannor and Shamir, we offer algorithms that attain the so called “small-loss” o ( α L ⋆ ) regret bounds with high probability, where α is the independence number of the graph and L ⋆ is the loss of the best action. Prior to our work, there was no data-dependent guarantee for general feedback graphs even for pseudo-regret (without dependence on the number of actions, i.e., utilizing the increased information feedback). Taking advantage of the black-box nature of our technique, we extend our results to many other applications, such as combinatorial semi-bandits (including routing in networks), contextual bandits (even with an infinite comparator class), and learning with slowly changing (shifting) comparators. In the special case of multi-armed bandit and combinatorial semi-bandit problems, we provide optimal small-loss, high-probability regret guarantees of O ˜ ( d L ⋆ ) , where d is the number of actions, answering open questions of Neu. Previous bounds for multi-armed bandits and semi-bandits were known only for pseudo-regret and only in expectation. We also offer an optimal O ˜ ( κ L ⋆ ) regret guarantee for fixed feedback graphs with clique-partition number at most κ .

Suggested Citation

  • Thodoris Lykouris & Karthik Sridharan & Éva Tardos, 2022. "Small-Loss Bounds for Online Learning with Partial Information," Mathematics of Operations Research, INFORMS, vol. 47(3), pages 2186-2218, August.
  • Handle: RePEc:inm:ormoor:v:47:y:2022:i:3:p:2186-2218
    DOI: 10.1287/moor.2021.1204
    as

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/moor.2021.1204
    Download Restriction: no

    File URL: https://libkey.io/10.1287/moor.2021.1204?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:ormoor:v:47:y:2022:i:3:p:2186-2218. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Chris Asher (email available below). General contact details of provider: https://edirc.repec.org/data/inforea.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.