
On the Whittle index of Markov modulated restless bandits

Author

Listed:
  • S. Duran

    (CNRS
    Université de Toulouse
    CNRS)

  • U. Ayesta

    (CNRS
    Université de Toulouse
    IKERBASQUE - Basque Foundation for Science
    UPV/EHU, University of the Basque Country)

  • I. M. Verloop

    (CNRS
    Université de Toulouse)

Abstract

In this paper, we study a Multi-Armed Restless Bandit Problem (MARBP) subject to time fluctuations. This model has numerous practical applications, for example in cloud computing systems and wireless communication networks. Each bandit is formed by two processes: a controllable process and an environment. The transition rates of the controllable process are determined by the state of the environment, which is an exogenous Markov process. The decision maker has full information on the state of every bandit, and the objective is to determine the optimal policy that minimises the long-run average cost. Given the complexity of the problem, we set out to characterise the Whittle index, which is obtained by solving a relaxed version of the MARBP. As reported in the literature, this heuristic performs extremely well for a wide variety of problems. Assuming that the optimal policy of the relaxed problem is of threshold type, we provide an algorithm that finds Whittle’s index. We then consider a multi-class queue with linear cost and impatient customers. For this model, we show threshold optimality, prove indexability, and obtain Whittle’s index in closed form. We also study the limiting regimes in which the environment is relatively slower or faster than the controllable process. Using numerical simulations, we assess the suboptimality of Whittle’s index policy in a wide variety of scenarios. The general observation is that, as in the case of the standard MARBP, the suboptimality gap of Whittle’s index policy is small.
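
For readers unfamiliar with the relaxation mentioned in the abstract, the following is a minimal sketch in standard restless-bandit notation. The symbols K, M, a_k, N_k, D_k, C_k and W are assumptions introduced here for illustration and are not taken from the paper. Writing the constraint as "at most M bandits active" is also an assumption.

```latex
% Hypothetical notation: K bandits, at most M active at any time.
% N_k(t): controllable state, D_k(t): environment state,
% a_k(t) in {0,1}: action (1 = active, 0 = passive), C_k: cost rate.

% Original problem: the capacity constraint must hold at every time t.
\[
\min_{\pi}\ \limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}^{\pi}\!\left[\int_0^T \sum_{k=1}^{K}
    C_k\bigl(N_k(t), D_k(t), a_k(t)\bigr)\,dt\right]
\quad \text{s.t.} \quad \sum_{k=1}^{K} a_k(t) \le M \ \ \text{for all } t \ge 0 .
\]

% Relaxation: the constraint only has to hold on average over time.
\[
\limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}^{\pi}\!\left[\int_0^T \sum_{k=1}^{K} a_k(t)\,dt\right] \le M .
\]

% Attaching a multiplier W >= 0 to the relaxed constraint decouples the
% problem into one subproblem per bandit, where W acts as a subsidy
% for being passive.
\[
\min_{\pi_k}\ \limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}^{\pi_k}\!\left[\int_0^T \Bigl(
    C_k\bigl(N_k(t), D_k(t), a_k(t)\bigr) - W\bigl(1 - a_k(t)\bigr)
  \Bigr)\,dt\right] .
\]

% Whittle index of bandit k in state (n, d): the smallest subsidy for
% which the passive action is optimal in that state.
\[
W_k(n,d) \;=\; \inf\bigl\{\, W : \text{passive is optimal in state } (n,d)
  \text{ of the single-bandit problem} \,\bigr\} .
\]
```

In this sketch a bandit is called indexable when the set of states in which passivity is optimal grows monotonically as W increases. The resulting heuristic activates, at each time t, the M bandits with the largest current index W_k(N_k(t), D_k(t)).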

Suggested Citation

  • S. Duran & U. Ayesta & I. M. Verloop, 2022. "On the Whittle index of Markov modulated restless bandits," Queueing Systems: Theory and Applications, Springer, vol. 102(3), pages 373-430, December.
  • Handle: RePEc:spr:queues:v:102:y:2022:i:3:d:10.1007_s11134-022-09737-y
    DOI: 10.1007/s11134-022-09737-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11134-022-09737-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11134-022-09737-y?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Glazebrook, K. D. & Mitchell, H. M. & Ansell, P. S., 2005. "Index policies for the maintenance of a collection of machines by a set of repairmen," European Journal of Operational Research, Elsevier, vol. 165(1), pages 267-284, August.
    2. P. S. Ansell & K. D. Glazebrook & J. Niño-Mora & M. O'Keeffe, 2003. "Whittle's index policy for a multi-class queueing system with convex holding costs," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 57(1), pages 21-39, April.
    3. van Dijk, Nico M., 1992. "Approximate uniformization for continuous-time Markov chains with an application to performability analysis," Stochastic Processes and their Applications, Elsevier, vol. 40(2), pages 339-357, March.
    4. K. D. Glazebrook & C. Kirkbride & J. Ouenniche, 2009. "Index Policies for the Admission Control and Routing of Impatient Customers to Heterogeneous Service Stations," Operations Research, INFORMS, vol. 57(4), pages 975-989, August.

    Citations



    Cited by:

    1. Li, Xiao & Li, Yuqiang & Wu, Xianyi, 2023. "Empirical Gittins index strategies with ε-explorations for multi-armed bandit problems," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Urtzi Ayesta & Manu K. Gupta & Ina Maria Verloop, 2021. "On the computation of Whittle’s index for Markovian restless bandits," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 93(1), pages 179-208, February.
    2. Ece Zeliha Demirci & Joachim Arts & Geert-Jan Van Houtum, 2022. "A restless bandit approach for capacitated condition based maintenance scheduling," DEM Discussion Paper Series 22-01, Department of Economics at the University of Luxembourg.
    3. Turgay Ayer & Can Zhang & Anthony Bonifonte & Anne C. Spaulding & Jagpreet Chhatwal, 2019. "Prioritizing Hepatitis C Treatment in U.S. Prisons," Operations Research, INFORMS, vol. 67(3), pages 853-873, May.
    4. Nicolas Gast & Bruno Gaujal & Kimang Khun, 2023. "Testing indexability and computing Whittle and Gittins index in subcubic time," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 97(3), pages 391-436, June.
    5. T. W. Archibald & D. P. Black & K. D. Glazebrook, 2009. "Indexability and Index Heuristics for a Simple Class of Inventory Routing Problems," Operations Research, INFORMS, vol. 57(2), pages 314-326, April.
    6. José Niño-Mora, 2006. "Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues," Mathematics of Operations Research, INFORMS, vol. 31(1), pages 50-84, February.
    7. Samuli Aalto & Ziv Scully, 2023. "Minimizing the mean slowdown in the M/G/1 queue," Queueing Systems: Theory and Applications, Springer, vol. 104(3), pages 187-210, August.
    8. Andrei Sleptchenko & M. Eric Johnson, 2015. "Maintaining Secure and Reliable Distributed Control Systems," INFORMS Journal on Computing, INFORMS, vol. 27(1), pages 103-117, February.
    9. José Niño-Mora, 2020. "A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits," Mathematics of Operations Research, INFORMS, vol. 45(2), pages 465-496, May.
    10. Vivek S. Borkar & Sarath Pattathil, 2022. "Whittle indexability in egalitarian processor sharing systems," Annals of Operations Research, Springer, vol. 317(2), pages 417-437, October.
    11. Urtzi Ayesta & M Erausquin & E Ferreira & P Jacko, 2016. "Optimal Dynamic Resource Allocation to Prevent Defaults," Working Papers hal-01300681, HAL.
    12. Dong Li & Li Ding & Stephen Connor, 2020. "When to Switch? Index Policies for Resource Scheduling in Emergency Response," Production and Operations Management, Production and Operations Management Society, vol. 29(2), pages 241-262, February.
    13. Ford, Stephen & Atkinson, Michael P. & Glazebrook, Kevin & Jacko, Peter, 2020. "On the dynamic allocation of assets subject to failure," European Journal of Operational Research, Elsevier, vol. 284(1), pages 227-239.
    14. Philip Cho & Vivek Farias & John Kessler & Retsef Levi & Thomas Magnanti & Eric Zarybnisky, 2015. "Maintenance and flight scheduling of low observable aircraft," Naval Research Logistics (NRL), John Wiley & Sons, vol. 62(1), pages 60-80, February.
    15. Rob Shone & Vincent A. Knight & Paul R. Harper, 2020. "A conservative index heuristic for routing problems with multiple heterogeneous service facilities," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 92(3), pages 511-543, December.
    16. David B. Brown & Martin B. Haugh, 2017. "Information Relaxation Bounds for Infinite Horizon Markov Decision Processes," Operations Research, INFORMS, vol. 65(5), pages 1355-1379, October.
    17. L Ding & K D Glazebrook, 2005. "A static allocation model for the outsourcing of warranty repairs," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 56(7), pages 825-835, July.
    18. K. D. Glazebrook & R. Minty, 2009. "A Generalized Gittins Index for a Class of Multiarmed Bandits with General Resource Requirements," Mathematics of Operations Research, INFORMS, vol. 34(1), pages 26-44, February.
    19. José Niño-Mora, 2023. "Markovian Restless Bandits and Index Policies: A Review," Mathematics, MDPI, vol. 11(7), pages 1-27, March.
    20. Dong Li & Kevin D. Glazebrook, 2010. "An approximate dynamic programing approach to the development of heuristics for the scheduling of impatient jobs in a clearing system," Naval Research Logistics (NRL), John Wiley & Sons, vol. 57(3), pages 225-236, April.
