The Irrevocable Multiarmed Bandit Problem

Author

Vivek F. Farias
(Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139)
Ritesh Madan
(Qualcomm New Jersey Research Center (NJRC), Bridgewater, New Jersey 08807)

Abstract

This paper considers the multiarmed bandit problem with multiple simultaneous arm pulls and the additional restriction that we do not allow recourse to arms that were pulled at some point in the past but then discarded. This additional restriction is highly desirable from an operational perspective, and we refer to this problem as the “irrevocable multiarmed bandit” problem. We observe that natural modifications to well-known heuristics for multiarmed bandit problems that satisfy this irrevocability constraint have unsatisfactory performance and, thus motivated, introduce a new heuristic: the “packing” heuristic. We establish through numerical experiments that the packing heuristic offers excellent performance, even relative to heuristics that are not constrained to be irrevocable. We also provide a theoretical analysis that studies the “price” of irrevocability, i.e., the performance loss incurred in imposing the constraint we propose on the multiarmed bandit model. We show that this performance loss is uniformly bounded for a general class of multiarmed bandit problems and indicate its dependence on various problem parameters. Finally, we obtain a computationally fast algorithm to implement the packing heuristic; the algorithm renders the packing heuristic computationally cheaper than methods that rely on the computation of Gittins indices.
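The irrevocability constraint described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's packing heuristic, whose details are not given on this page. It assumes Bernoulli arms with Beta(1, 1) priors and a hypothetical discard rule (drop an arm once its posterior mean falls below a threshold after a minimum number of pulls). The names `run_irrevocable_policy`, `is_irrevocable`, `patience`, and `threshold` are illustrative assumptions, not taken from the paper.

```python
import random


def run_irrevocable_policy(true_means, k, horizon, patience=3, threshold=0.5, seed=0):
    """Simulate a toy irrevocable policy with up to k simultaneous pulls.

    Assumed rule (illustrative only, NOT the packing heuristic): an active arm
    is discarded once it has been pulled at least `patience` times and its
    Beta(1, 1) posterior mean is below `threshold`; it is replaced by an arm
    that has never been pulled. Discarded arms are never pulled again.
    """
    rng = random.Random(seed)
    n = len(true_means)
    untried = list(range(n))                       # arms never pulled so far
    active = [untried.pop(0) for _ in range(min(k, n))]
    successes = [0] * n
    pulls = [0] * n
    discarded = set()
    history = []                                   # arms pulled in each period
    total_reward = 0

    for _ in range(horizon):
        history.append(tuple(active))
        for arm in active:
            reward = 1 if rng.random() < true_means[arm] else 0
            successes[arm] += reward
            pulls[arm] += 1
            total_reward += reward

        next_active = []
        for arm in active:
            posterior_mean = (successes[arm] + 1) / (pulls[arm] + 2)
            if untried and pulls[arm] >= patience and posterior_mean < threshold:
                discarded.add(arm)                 # irrevocable: never re-added
                next_active.append(untried.pop(0))
            else:
                next_active.append(arm)
        active = next_active

    return total_reward, history, discarded


def is_irrevocable(history):
    """Return True if no arm is pulled again after it has been dropped."""
    dropped = set()
    previous = set()
    for pulled in history:
        current = set(pulled)
        if current & dropped:
            return False
        dropped |= previous - current
        previous = current
    return True
```

For example, calling `run_irrevocable_policy([0.2, 0.8, 0.4, 0.9, 0.1, 0.6], k=2, horizon=50)` returns the total reward, the per-period pull history, and the set of discarded arms; passing that history to `is_irrevocable` checks that the schedule never returns to a dropped arm. A schedule such as `[(0,), (1,), (0,)]` fails the check because arm 0 is revisited after being dropped.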

Suggested Citation

Vivek F. Farias & Ritesh Madan, 2011. "The Irrevocable Multiarmed Bandit Problem," Operations Research, INFORMS, vol. 59(2), pages 383-399, April.

Handle: RePEc:inm:oropre:v:59:y:2011:i:2:p:383-399
DOI: 10.1287/opre.1100.0891

Citations

Cited by:

Will Ma, 2018. "Improvements and Generalizations of Stochastic Knapsack and Markovian Bandits Approximation Algorithms," Mathematics of Operations Research, INFORMS, vol. 43(3), pages 789-812, August.
David B. Brown & James E. Smith, 2013. "Optimal Sequential Exploration: Bandits, Clairvoyants, and Wildcats," Operations Research, INFORMS, vol. 61(3), pages 644-665, June.
Denis Sauré & Assaf Zeevi, 2013. "Optimal Dynamic Assortment Planning with Demand Learning," Manufacturing & Service Operations Management, INFORMS, vol. 15(3), pages 387-404, July.
Alessandro Arlotto & Stephen E. Chick & Noah Gans, 2014. "Optimal Hiring and Retention Policies for Heterogeneous Workers Who Learn," Management Science, INFORMS, vol. 60(1), pages 110-129, January.
Mohammed Shahid Abdulla & Shalabh Bhatnagar, 2016. "Multi-armed bandits based on a variant of Simulated Annealing," Indian Journal of Pure and Applied Mathematics, Springer, vol. 47(2), pages 195-212, June.
Kris Johnson Ferreira & Joel Goh, 2021. "Assortment Rotation and the Value of Concealment," Management Science, INFORMS, vol. 67(3), pages 1489-1507, March.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

David B. Brown & James E. Smith, 2020. "Index Policies and Performance Bounds for Dynamic Selection Problems," Management Science, INFORMS, vol. 66(7), pages 3029-3050, July.
Santiago R. Balseiro & David B. Brown & Chen Chen, 2021. "Dynamic Pricing of Relocating Resources in Large Networks," Management Science, INFORMS, vol. 67(7), pages 4075-4094, July.
Jacko, Peter & Niño Mora, José, 2009. "An index for dynamic product promotion and the knapsack problem for perishable items," DES - Working Papers. Statistics and Econometrics. WS ws093111, Universidad Carlos III de Madrid. Departamento de Estadística.
Stephen Chick & Martin Forster & Paolo Pertile, 2017. "A Bayesian decision theoretic model of sequential experimentation with delayed response," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1439-1462, November.
- Stephen Chick & Martin Forster & Paolo Pertile, 2015. "A Bayesian Decision-Theoretic Model of Sequential Experimentation with Delayed Response," Discussion Papers 15/09, Department of Economics, University of York.
Malekipirbazari, Milad, 2025. "Optimizing sequential decision-making under risk: Strategic allocation with switching penalties," European Journal of Operational Research, Elsevier, vol. 321(1), pages 160-176.
Elodie Adida & Georgia Perakis, 2010. "Dynamic pricing and inventory control: robust vs. stochastic uncertainty models—a computational study," Annals of Operations Research, Springer, vol. 181(1), pages 125-157, December.
Yiangos Papanastasiou & Kostas Bimpikis & Nicos Savva, 2018. "Crowdsourcing Exploration," Management Science, INFORMS, vol. 64(4), pages 1727-1746, April.
Hao Zhang, 2022. "Analytical Solution to a Discrete-Time Model for Dynamic Learning and Decision Making," Management Science, INFORMS, vol. 68(8), pages 5924-5957, August.
Will Ma, 2018. "Improvements and Generalizations of Stochastic Knapsack and Markovian Bandits Approximation Algorithms," Mathematics of Operations Research, INFORMS, vol. 43(3), pages 789-812, August.
Nicolás Aramayo & Mario Schiappacasse & Marcel Goic, 2023. "A Multiarmed Bandit Approach for House Ads Recommendations," Marketing Science, INFORMS, vol. 42(2), pages 271-292, March.
Vishal Ahuja & John R. Birge, 2020. "An Approximation Approach for Response-Adaptive Clinical Trial Design," INFORMS Journal on Computing, INFORMS, vol. 32(4), pages 877-894, October.
José Niño-Mora, 2023. "Markovian Restless Bandits and Index Policies: A Review," Mathematics, MDPI, vol. 11(7), pages 1-27, March.
Onesun Steve Yoo & Kevin McCardle, 2020. "The Valuator’s Curse: Decision Analysis of Overvaluation and Disappointment in Acquisition," Decision Analysis, INFORMS, vol. 17(4), pages 299-313, December.
Ahuja, Vishal & Birge, John R., 2016. "Response-adaptive designs for clinical trials: Simultaneous learning from multiple patients," European Journal of Operational Research, Elsevier, vol. 248(2), pages 619-633.
José Niño-Mora, 2006. "Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues," Mathematics of Operations Research, INFORMS, vol. 31(1), pages 50-84, February.
Qi Chen & Qi Xu & Wenjie Wang, 2019. "Optimal Policies for the Pricing and Replenishment of Fashion Apparel considering the Effect of Fashion Level," Complexity, Hindawi, vol. 2019, pages 1-12, February.
Bayliss, Christopher & Currie, Christine S.M. & Bennell, Julia A. & Martinez-Sykora, Antonio, 2021. "Queue-constrained packing: A vehicle ferry case study," European Journal of Operational Research, Elsevier, vol. 289(2), pages 727-741.
Deligiannis, Michalis & Liberopoulos, George, 2023. "Dynamic ordering and buyer selection policies when service affects future demand," Omega, Elsevier, vol. 118(C).
Kohei Kawaguchi, 2021. "When Will Workers Follow an Algorithm? A Field Experiment with a Retail Business," Management Science, INFORMS, vol. 67(3), pages 1670-1695, March.
Martin Skutella & Maxim Sviridenko & Marc Uetz, 2016. "Unrelated Machine Scheduling with Stochastic Processing Times," Mathematics of Operations Research, INFORMS, vol. 41(3), pages 851-864, August.

More about this item

Keywords

dynamic programming/optimal control; multiarmed bandit problem; production/scheduling; learning; sequencing; stochastic;

