
Satisficing in Time-Sensitive Bandit Learning

Author

Listed:
  • Daniel Russo

    (Columbia Business School, Columbia University, New York, New York 10027)

  • Benjamin Van Roy

    (Department of Electrical Engineering and Department of Management Science and Engineering, Stanford University, Stanford, California 94305)

Abstract

Much of the recent literature on bandit learning focuses on algorithms that aim to converge on an optimal action. One shortcoming is that this orientation does not account for time sensitivity, which can play a crucial role when learning an optimal action requires much more information than near-optimal ones. Indeed, popular approaches, such as upper-confidence-bound methods and Thompson sampling, can fare poorly in such situations. We consider instead learning a satisficing action, which is near-optimal while requiring less information, and propose satisficing Thompson sampling, an algorithm that serves this purpose. We establish a general bound on expected discounted regret and study the application of satisficing Thompson sampling to linear and infinite-armed bandits, demonstrating arbitrarily large benefits over Thompson sampling. We also discuss the relation between the notion of satisficing and the theory of rate distortion, which offers guidance on the selection of satisficing actions.
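
The abstract names satisficing Thompson sampling but does not spell out its steps, so the following is only an illustrative sketch under stated assumptions, not the authors' definition. It assumes a finite Bernoulli bandit with Beta(1, 1) priors, and it assumes one plausible satisficing rule: after drawing a posterior sample, play the lowest-indexed arm whose sampled mean is within a tolerance of the sampled best. The function name, the parameter names (true_means, horizon, epsilon, seed), and the tie-breaking by index are all hypothetical choices made for the example.

```python
import numpy as np


def satisficing_thompson_sampling(true_means, horizon, epsilon, seed=0):
    """Simulate a Bernoulli bandit and return the sequence of arms pulled.

    Assumption: the "satisficing" arm is the lowest-indexed arm whose sampled
    mean is within epsilon of the sampled maximum. With epsilon = 0 this
    reduces to ordinary Thompson sampling.
    """
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means, dtype=float)
    n_arms = true_means.size
    alpha = np.ones(n_arms)  # Beta posterior: 1 + observed successes
    beta = np.ones(n_arms)   # Beta posterior: 1 + observed failures
    pulls = np.empty(horizon, dtype=int)

    for t in range(horizon):
        # Draw one plausible mean per arm from the current posterior.
        sampled = rng.beta(alpha, beta)
        # argmax on a boolean array returns the first True entry, i.e. the
        # lowest-indexed arm that is epsilon-optimal under the sample.
        arm = int(np.argmax(sampled >= sampled.max() - epsilon))
        # Observe a Bernoulli reward and update that arm's posterior.
        reward = float(rng.random() < true_means[arm])
        alpha[arm] += reward
        beta[arm] += 1.0 - reward
        pulls[t] = arm

    return pulls


if __name__ == "__main__":
    means = [0.50, 0.52, 0.51, 0.90]
    for eps in (0.0, 0.05):
        pulls = satisficing_thompson_sampling(means, horizon=2000, epsilon=eps)
        print(eps, np.bincount(pulls, minlength=len(means)))
```

In this sketch a larger epsilon makes the rule settle sooner on an early arm that looks good enough, which is the time-sensitivity trade-off the abstract describes; how the tolerance should be chosen is not something the abstract specifies.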

Suggested Citation

  • Daniel Russo & Benjamin Van Roy, 2022. "Satisficing in Time-Sensitive Bandit Learning," Mathematics of Operations Research, INFORMS, vol. 47(4), pages 2815-2839, November.
  • Handle: RePEc:inm:ormoor:v:47:y:2022:i:4:p:2815-2839
    DOI: 10.1287/moor.2021.1229

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/moor.2021.1229
    Download Restriction: no

