Printed from https://ideas.repec.org/a/gam/jmathe/v8y2020i12p2226-d462361.html

A Fast-Pivoting Algorithm for Whittle’s Restless Bandit Index

Author

Listed:
  • José Niño-Mora

    (Department of Statistics, Carlos III University of Madrid, 28903 Getafe, Spain)

Abstract

The Whittle index for restless bandits (two-action semi-Markov decision processes) provides an intuitively appealing optimal policy for controlling a single generic project that can be active (engaged) or passive (rested) at each decision epoch, and which can change state while passive. It further provides a practical heuristic priority-index policy for the computationally intractable multi-armed restless bandit problem, which has been widely applied over the last three decades in multifarious settings, yet mostly restricted to project models with a one-dimensional state. This is due in part to the difficulty of establishing indexability (existence of the index) and of computing the index for projects with large state spaces. This paper draws on the author’s prior results on sufficient indexability conditions and an adaptive-greedy algorithmic scheme for restless bandits to obtain a new fast-pivoting algorithm that computes the n Whittle index values of an n-state restless bandit by performing, after an initialization stage, n steps that entail (2/3)n^3 + O(n^2) arithmetic operations. This algorithm also draws on the parametric simplex method, and is based on elucidating the pattern of parametric simplex tableaux, which makes it possible to exploit special structure to simplify simplex pivoting steps and substantially reduce their complexity. A numerical study demonstrates substantial runtime speed-ups versus alternative algorithms.
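
To make the notion of a Whittle index concrete, the following is a minimal sketch of a brute-force baseline, not the fast-pivoting algorithm of the paper. It assumes a discrete-time, discounted, indexable project with a small finite state space, and finds the index of each state by bisection on the activity charge lam, solving the lam-charge problem by value iteration at every step. All names (R, P, beta, solve_values, active_gap, whittle_index_bisection) and the search bracket [-10, 10] are hypothetical choices made for illustration.

```python
# Brute-force Whittle index by bisection (illustrative baseline only;
# this is NOT the fast-pivoting algorithm described in the abstract).
# Assumptions: discrete time, discount factor beta in (0, 1), the project is
# indexable, and every index value lies inside the bracket [lo, hi].
# R[a][i]    : one-period reward in state i under action a (0 passive, 1 active)
# P[a][i][j] : transition probability from state i to state j under action a


def solve_values(R, P, beta, lam, tol=1e-12, max_iter=100_000):
    """Value iteration for the single-project problem with activity charge lam."""
    n = len(R[0])
    V = [0.0] * n
    for _ in range(max_iter):
        new_V = []
        for i in range(n):
            q0 = R[0][i] + beta * sum(P[0][i][j] * V[j] for j in range(n))
            q1 = R[1][i] - lam + beta * sum(P[1][i][j] * V[j] for j in range(n))
            new_V.append(max(q0, q1))
        diff = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if diff < tol:
            break
    return V


def active_gap(R, P, beta, lam, i):
    """Advantage of the active over the passive action in state i at charge lam."""
    n = len(R[0])
    V = solve_values(R, P, beta, lam)
    q0 = R[0][i] + beta * sum(P[0][i][j] * V[j] for j in range(n))
    q1 = R[1][i] - lam + beta * sum(P[1][i][j] * V[j] for j in range(n))
    return q1 - q0


def whittle_index_bisection(R, P, beta, i, lo=-10.0, hi=10.0, tol=1e-9):
    """Charge at which both actions are optimal in state i (indexability assumed)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if active_gap(R, P, beta, mid, i) > 0:
            lo = mid  # active still strictly preferred: the index is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy 3-state project whose dynamics do not depend on the action (P[0] == P[1]).
chain = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
P = [chain, chain]
R = [[0.0, 0.0, 0.0], [1.0, 0.5, 2.0]]
beta = 0.9
indices = [whittle_index_bisection(R, P, beta, i) for i in range(3)]
print(indices)
```

In this toy project the action does not affect the transitions, so the future value is the same under both actions and the index of state i reduces to R[1][i] - R[0][i], which gives a simple sanity check. Each bisection step here requires a full value-iteration solve, so the cost grows quickly with the number of states; the abstract's claim is that the fast-pivoting algorithm instead obtains all n index values in n steps entailing (2/3)n^3 + O(n^2) arithmetic operations after initialization.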

Suggested Citation

  • José Niño-Mora, 2020. "A Fast-Pivoting Algorithm for Whittle’s Restless Bandit Index," Mathematics, MDPI, vol. 8(12), pages 1-21, December.
  • Handle: RePEc:gam:jmathe:v:8:y:2020:i:12:p:2226-:d:462361

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/8/12/2226/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/8/12/2226/
    Download Restriction: no

    References listed on IDEAS

    1. Dong Li & Li Ding & Stephen Connor, 2020. "When to Switch? Index Policies for Resource Scheduling in Emergency Response," Production and Operations Management, Production and Operations Management Society, vol. 29(2), pages 241-262, February.
    2. Sonin, Isaac M., 2008. "A generalized Gittins index for a Markov chain and its recursive calculation," Statistics & Probability Letters, Elsevier, vol. 78(12), pages 1526-1533, September.
    3. Bernardo A. Huberman & Fang Wu, 2008. "The Economics Of Attention: Maximizing User Value In Information-Rich Environments," Advances in Complex Systems (ACS), World Scientific Publishing Co. Pte. Ltd., vol. 11(04), pages 487-496.
    4. Abderrahmane Abbou & Viliam Makis, 2019. "Group Maintenance: A Restless Bandits Approach," INFORMS Journal on Computing, INFORMS, vol. 31(4), pages 719-731, October.
    5. Christos H. Papadimitriou & John N. Tsitsiklis, 1999. "The Complexity of Optimal Queuing Network Control," Mathematics of Operations Research, INFORMS, vol. 24(2), pages 293-305, May.
    6. Saul Gass & Thomas Saaty, 1955. "The computational algorithm for the parametric objective function," Naval Research Logistics Quarterly, John Wiley & Sons, vol. 2(1‐2), pages 39-45, March.
    7. Yih Ren Chen & Michael N. Katehakis, 1986. "Linear Programming for Finite State Multi-Armed Bandit Problems," Mathematics of Operations Research, INFORMS, vol. 11(1), pages 180-183, February.
    8. Dimitris Bertsimas & José Niño-Mora, 1996. "Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; A Polyhedral Approach to Indexable Systems," Mathematics of Operations Research, INFORMS, vol. 21(2), pages 257-306, May.
    9. Michael N. Katehakis & Arthur F. Veinott, 1987. "The Multi-Armed Bandit Problem: Decomposition and Computation," Mathematics of Operations Research, INFORMS, vol. 12(2), pages 262-268, May.
    10. Turgay Ayer & Can Zhang & Anthony Bonifonte & Anne C. Spaulding & Jagpreet Chhatwal, 2019. "Prioritizing Hepatitis C Treatment in U.S. Prisons," Operations Research, INFORMS, vol. 67(3), pages 853-873, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. José Niño-Mora, 2020. "Fast Two-Stage Computation of an Index Policy for Multi-Armed Bandits with Setup Delays," Mathematics, MDPI, vol. 9(1), pages 1-36, December.
    2. Urtzi Ayesta & Manu K. Gupta & Ina Maria Verloop, 2021. "On the computation of Whittle’s index for Markovian restless bandits," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 93(1), pages 179-208, February.
    3. José Niño-Mora, 2020. "A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits," Mathematics of Operations Research, INFORMS, vol. 45(2), pages 465-496, May.
    4. Nicolas Gast & Bruno Gaujal & Kimang Khun, 2023. "Testing indexability and computing Whittle and Gittins index in subcubic time," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 97(3), pages 391-436, June.
    5. Ya‐Tang Chuang & Manaf Zargoush & Somayeh Ghazalbash & Saied Samiedaluie & Kerry Kuluski & Sara Guilcher, 2023. "From prediction to decision: Optimizing long‐term care placements among older delayed discharge patients," Production and Operations Management, Production and Operations Management Society, vol. 32(4), pages 1041-1058, April.
    6. K. D. Glazebrook & R. Minty, 2009. "A Generalized Gittins Index for a Class of Multiarmed Bandits with General Resource Requirements," Mathematics of Operations Research, INFORMS, vol. 34(1), pages 26-44, February.
    7. José Niño-Mora, 2022. "Multi-Gear Bandits, Partial Conservation Laws, and Indexability," Mathematics, MDPI, vol. 10(14), pages 1-31, July.
    8. Malekipirbazari, Milad & Çavuş, Özlem, 2024. "Index policy for multiarmed bandit problem with dynamic risk measures," European Journal of Operational Research, Elsevier, vol. 312(2), pages 627-640.
    9. K. D. Glazebrook & C. Kirkbride & H. M. Mitchell & D. P. Gaver & P. A. Jacobs, 2007. "Index Policies for Shooting Problems," Operations Research, INFORMS, vol. 55(4), pages 769-781, August.
    10. José Niño-Mora, 2006. "Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues," Mathematics of Operations Research, INFORMS, vol. 31(1), pages 50-84, February.
    11. Lodewijk Kallenberg, 2013. "Derman’s book as inspiration: some results on LP for MDPs," Annals of Operations Research, Springer, vol. 208(1), pages 63-94, September.
    12. Esther Frostig & Gideon Weiss, 2016. "Four proofs of Gittins’ multiarmed bandit theorem," Annals of Operations Research, Springer, vol. 241(1), pages 127-165, June.
    13. R. T. Dunn & K. D. Glazebrook, 2004. "Discounted Multiarmed Bandit Problems on a Collection of Machines with Varying Speeds," Mathematics of Operations Research, INFORMS, vol. 29(2), pages 266-279, May.
    14. Dong Li & Li Ding & Stephen Connor, 2020. "When to Switch? Index Policies for Resource Scheduling in Emergency Response," Production and Operations Management, Production and Operations Management Society, vol. 29(2), pages 241-262, February.
    15. Rob Shone & Vincent A. Knight & Paul R. Harper, 2020. "A conservative index heuristic for routing problems with multiple heterogeneous service facilities," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 92(3), pages 511-543, December.
    16. Vishal Ahuja & John R. Birge, 2020. "An Approximation Approach for Response-Adaptive Clinical Trial Design," INFORMS Journal on Computing, INFORMS, vol. 32(4), pages 877-894, October.
    17. José Niño-Mora, 2023. "Markovian Restless Bandits and Index Policies: A Review," Mathematics, MDPI, vol. 11(7), pages 1-27, March.
    18. José Niño-Mora, 2007. "A (2/3)n^3 Fast-Pivoting Algorithm for the Gittins Index and Optimal Stopping of a Markov Chain," INFORMS Journal on Computing, INFORMS, vol. 19(4), pages 596-606, November.
    19. Ece Zeliha Demirci & Joachim Arts & Geert-Jan Van Houtum, 2022. "A restless bandit approach for capacitated condition based maintenance scheduling," DEM Discussion Paper Series 22-01, Department of Economics at the University of Luxembourg.
    20. Isaac M. Sonin & Constantine Steinberg, 2016. "Continue, quit, restart probability model," Annals of Operations Research, Springer, vol. 241(1), pages 295-318, June.
