Printed from https://ideas.repec.org/a/gam/jmathe/v9y2020i1p52-d469887.html

Fast Two-Stage Computation of an Index Policy for Multi-Armed Bandits with Setup Delays

Author

Listed:
  • José Niño-Mora

    (Department of Statistics, Carlos III University of Madrid, 28903 Getafe, Spain)

Abstract

We consider the multi-armed bandit problem with switching penalties that include setup delays and costs, extending the author's earlier results for the special case with no switching delays. Asawa and Teneketzis introduced in 1996 a priority index for projects with setup delays that partly characterizes optimal policies, but gave no means of computing it. We present a fast two-stage method for computing this index. The first stage computes the continuation index (which applies when the project has been set up), along with certain extra quantities, with cubic (arithmetic-operation) complexity in the number of project states. The second stage then computes the switching index (which applies when the project is not set up) with quadratic complexity. The approach rests on new methodological advances in restless bandit indexation, introduced and deployed herein and motivated by the limitations of previous results; it exploits the fact that the aforementioned index is the Whittle index of the project in its restless reformulation. A numerical study demonstrates substantial runtime speed-ups of the new two-stage index algorithm over a general one-stage Whittle index algorithm. The study further gives evidence that, in a multi-project setting, the index policy is consistently nearly optimal.
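
The abstract distinguishes a continuation index (used when a project is set up) from a switching index (used when it is not). As a rough illustration of how such a pair of indices could drive a priority rule, the Python sketch below picks the project with the highest applicable index. It is not the paper's two-stage algorithm: the function `select_project`, the table layout, and all index values are hypothetical, and the sketch assumes that exactly one project is engaged per period and that at most one project is set up at a time.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical index tables, for illustration only. The numbers are made up;
# they are NOT produced by the two-stage algorithm described in the abstract.
# Each project maps a state (a list position) to its index value.
projects: List[Dict[str, List[float]]] = [
    {"continuation": [0.9, 0.4], "switching": [0.6, 0.1]},
    {"continuation": [0.8, 0.5], "switching": [0.7, 0.2]},
]


def select_project(
    projects: List[Dict[str, List[float]]],
    states: List[int],
    set_up: Optional[int],
) -> Tuple[int, float]:
    """Return (project position, index value) of the highest-index project.

    The project that is currently set up is scored with its continuation
    index; every other project is scored with its switching index.
    `set_up` is the position of the set-up project, or None if there is none.
    """
    best_k, best_val = -1, float("-inf")
    for k, (proj, s) in enumerate(zip(projects, states)):
        table = proj["continuation"] if k == set_up else proj["switching"]
        if table[s] > best_val:
            best_k, best_val = k, table[s]
    return best_k, best_val


# Project 0 is set up and in state 0: its continuation index wins, so stay.
print(select_project(projects, [0, 0], set_up=0))
# Project 0 has moved to state 1: project 1's switching index is now higher.
print(select_project(projects, [1, 0], set_up=0))
```

In the made-up tables each switching index is below the continuation index of the same state, reflecting the intuition that a setup penalty makes a project that is not set up less attractive; whether this ordering holds in general is not stated in the abstract.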

Suggested Citation

  • José Niño-Mora, 2020. "Fast Two-Stage Computation of an Index Policy for Multi-Armed Bandits with Setup Delays," Mathematics, MDPI, vol. 9(1), pages 1-36, December.
  • Handle: RePEc:gam:jmathe:v:9:y:2020:i:1:p:52-:d:469887

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/1/52/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/1/52/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. José Niño-Mora, 2020. "A Fast-Pivoting Algorithm for Whittle’s Restless Bandit Index," Mathematics, MDPI, vol. 8(12), pages 1-21, December.
    2. Urtzi Ayesta & Manu K. Gupta & Ina Maria Verloop, 2021. "On the computation of Whittle’s index for Markovian restless bandits," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 93(1), pages 179-208, February.
    3. Ya‐Tang Chuang & Manaf Zargoush & Somayeh Ghazalbash & Saied Samiedaluie & Kerry Kuluski & Sara Guilcher, 2023. "From prediction to decision: Optimizing long‐term care placements among older delayed discharge patients," Production and Operations Management, Production and Operations Management Society, vol. 32(4), pages 1041-1058, April.
    4. Alessandro Arlotto & Stephen E. Chick & Noah Gans, 2014. "Optimal Hiring and Retention Policies for Heterogeneous Workers Who Learn," Management Science, INFORMS, vol. 60(1), pages 110-129, January.
    5. K. D. Glazebrook & R. Minty, 2009. "A Generalized Gittins Index for a Class of Multiarmed Bandits with General Resource Requirements," Mathematics of Operations Research, INFORMS, vol. 34(1), pages 26-44, February.
    6. José Niño-Mora, 2023. "Markovian Restless Bandits and Index Policies: A Review," Mathematics, MDPI, vol. 11(7), pages 1-27, March.
    7. José Niño-Mora, 2022. "Multi-Gear Bandits, Partial Conservation Laws, and Indexability," Mathematics, MDPI, vol. 10(14), pages 1-31, July.
    8. José Niño-Mora, 2006. "Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues," Mathematics of Operations Research, INFORMS, vol. 31(1), pages 50-84, February.
    9. José Niño-Mora, 2020. "A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits," Mathematics of Operations Research, INFORMS, vol. 45(2), pages 465-496, May.
    10. Gui Liberali & Alina Ferecatu, 2022. "Morphing for Consumer Dynamics: Bandits Meet Hidden Markov Models," Marketing Science, INFORMS, vol. 41(4), pages 769-794, July.
    11. Song Lin & Juanjuan Zhang & John R. Hauser, 2015. "Learning from Experience, Simply," Marketing Science, INFORMS, vol. 34(1), pages 1-19, January.
    12. Forand, Jean Guillaume, 2015. "Keeping your options open," Journal of Economic Dynamics and Control, Elsevier, vol. 53(C), pages 47-68.
    13. Dong Li & Li Ding & Stephen Connor, 2020. "When to Switch? Index Policies for Resource Scheduling in Emergency Response," Production and Operations Management, Production and Operations Management Society, vol. 29(2), pages 241-262, February.
    14. Rob Shone & Vincent A. Knight & Paul R. Harper, 2020. "A conservative index heuristic for routing problems with multiple heterogeneous service facilities," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 92(3), pages 511-543, December.
    15. Haijian Si & Stylianos Kavadias & Christoph Loch, 2022. "Managing innovation portfolios: From project selection to portfolio design," Production and Operations Management, Production and Operations Management Society, vol. 31(12), pages 4572-4588, December.
    16. Ece Zeliha Demirci & Joachim Arts & Geert-Jan Van Houtum, 2022. "A restless bandit approach for capacitated condition based maintenance scheduling," DEM Discussion Paper Series 22-01, Department of Economics at the University of Luxembourg.
    17. K. D. Glazebrook & C. Kirkbride & J. Ouenniche, 2009. "Index Policies for the Admission Control and Routing of Impatient Customers to Heterogeneous Service Stations," Operations Research, INFORMS, vol. 57(4), pages 975-989, August.
    18. Keller, Godfrey & Oldale, Alison, 2003. "Branching bandits: a sequential search process with correlated pay-offs," Journal of Economic Theory, Elsevier, vol. 113(2), pages 302-315, December.
    19. Dimitris Bertsimas & José Niño-Mora, 2000. "Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic," Operations Research, INFORMS, vol. 48(1), pages 80-90, February.
    20. K. D. Glazebrook & C. Kirkbride & H. M. Mitchell & D. P. Gaver & P. A. Jacobs, 2007. "Index Policies for Shooting Problems," Operations Research, INFORMS, vol. 55(4), pages 769-781, August.
