Printed from https://ideas.repec.org/a/inm/oropre/v48y2000i1p80-90.html

Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic

Author

Listed:
  • Dimitris Bertsimas

    (Sloan School of Management and Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA 02139)

  • José Niño-Mora

    (Department of Economics and Business, Universitat Pompeu Fabra, E-08005 Barcelona, Spain)

Abstract

We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of N (where N is the number of bandits) increasingly stronger linear programming relaxations, the last of which is exact and corresponds to the (exponential size) formulation of the problem as a Markov decision chain, while the other relaxations provide bounds and are efficiently computed. We also propose a priority-index heuristic scheduling policy from the solution to the first-order relaxation, where the indices are defined in terms of optimal dual variables. In this way we propose a policy and a suboptimality guarantee. We report results of computational experiments that suggest that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.
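
As a rough illustration of the kind of constraint a linear programming relaxation of a bandit problem is built from, the sketch below computes the discounted state-action occupation measure of one two-state project under a fixed policy and checks that it satisfies the standard flow-balance equalities of the linear programming formulation of a Markov decision chain. Everything in it is an assumption made for illustration and is not taken from the paper: the discounted criterion, the discount factor `BETA`, the initial distribution `ALPHA`, the transition probabilities `P`, the rewards `REWARD`, and the helper names.

```python
# Illustrative sketch only: all numbers and names below are made up.
BETA = 0.9                      # assumed discount factor
ALPHA = [0.5, 0.5]              # assumed initial state distribution
# P[a][i][j]: probability of moving from state i to state j under action a
P = [
    [[0.7, 0.3], [0.4, 0.6]],   # a = 0 (passive)
    [[0.2, 0.8], [0.9, 0.1]],   # a = 1 (active)
]
REWARD = [[0.0, 1.0], [2.0, 0.5]]  # REWARD[a][i], assumed one-step rewards


def occupation_measure(policy, iters=2000):
    """Return x[a][i], the expected discounted time spent in state i taking action a.

    `policy[i]` is the action (0 or 1) chosen in state i. The state occupancy y
    solves y = ALPHA + BETA * P_policy^T y, found here by fixed-point iteration,
    which converges because BETA < 1.
    """
    n = len(ALPHA)
    y = ALPHA[:]
    for _ in range(iters):
        y = [
            ALPHA[j] + BETA * sum(P[policy[i]][i][j] * y[i] for i in range(n))
            for j in range(n)
        ]
    x = [[0.0] * n for _ in range(2)]
    for i in range(n):
        x[policy[i]][i] = y[i]
    return x


def flow_balance_residual(x):
    """Largest violation of sum_a x[a][j] = ALPHA[j] + BETA * sum_{i,a} P[a][i][j] * x[a][i]."""
    n = len(ALPHA)
    worst = 0.0
    for j in range(n):
        lhs = x[0][j] + x[1][j]
        rhs = ALPHA[j] + BETA * sum(
            P[a][i][j] * x[a][i] for a in range(2) for i in range(n)
        )
        worst = max(worst, abs(lhs - rhs))
    return worst


def discounted_reward(x):
    """Expected total discounted reward, linear in the occupation measure."""
    return sum(REWARD[a][i] * x[a][i] for a in range(2) for i in range(len(ALPHA)))


x = occupation_measure(policy=[1, 0])   # active in state 0, passive in state 1
print(flow_balance_residual(x), discounted_reward(x))
```

Because the reward is linear in `x` and every policy yields an `x` satisfying these equalities, maximizing the reward over all nonnegative `x` that satisfy them gives an upper bound on what any policy can achieve for a single project. A relaxation for several projects would additionally need a constraint coupling the projects, which this sketch does not attempt to model.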

Suggested Citation

  • Dimitris Bertsimas & José Niño-Mora, 2000. "Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic," Operations Research, INFORMS, vol. 48(1), pages 80-90, February.
  • Handle: RePEc:inm:oropre:v:48:y:2000:i:1:p:80-90
    DOI: 10.1287/opre.48.1.80.12444

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/opre.48.1.80.12444
    Download Restriction: no


    References listed on IDEAS

    1. Christos H. Papadimitriou & John N. Tsitsiklis, 1999. "The Complexity of Optimal Queuing Network Control," Mathematics of Operations Research, INFORMS, vol. 24(2), pages 293-305, May.
    2. J. George Shanthikumar & David D. Yao, 1992. "Multiclass Queueing Systems: Polymatroidal Structure and Optimal Scheduling Control," Operations Research, INFORMS, vol. 40(3-supplem), pages 293-299, June.
    3. A. Federgruen & H. Groenevelt, 1988. "Characterization and Optimization of Achievable Performance in General Queueing Systems," Operations Research, INFORMS, vol. 36(5), pages 733-741, October.
    4. E. G. Coffman & I. Mitrani, 1980. "A Characterization of Waiting Time Performance Realizable by Single-Server Queues," Operations Research, INFORMS, vol. 28(3-part-ii), pages 810-821, June.
    5. Dimitris Bertsimas & José Niño-Mora, 1996. "Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; A Polyhedral Approach to Indexable Systems," Mathematics of Operations Research, INFORMS, vol. 21(2), pages 257-306, May.

    Citations

    Cited by:

    1. José Niño-Mora, 2006. "Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues," Mathematics of Operations Research, INFORMS, vol. 31(1), pages 50-84, February.
    2. Peter Hulshof & Richard Boucherie & Erwin Hans & Johann Hurink, 2013. "Tactical resource allocation and elective patient admission planning in care processes," Health Care Management Science, Springer, vol. 16(2), pages 152-166, June.
    3. José Niño-Mora, 2023. "Markovian Restless Bandits and Index Policies: A Review," Mathematics, MDPI, vol. 11(7), pages 1-27, March.
    4. José Niño-Mora, 2000. "On certain greedoid polyhedra, partially indexable scheduling problems and extended restless bandit allocation indices," Economics Working Papers 456, Department of Economics and Business, Universitat Pompeu Fabra.
    5. David B. Brown & James E. Smith, 2020. "Index Policies and Performance Bounds for Dynamic Selection Problems," Management Science, INFORMS, vol. 66(7), pages 3029-3050, July.
    6. Andrei Sleptchenko & M. Eric Johnson, 2015. "Maintaining Secure and Reliable Distributed Control Systems," INFORMS Journal on Computing, INFORMS, vol. 27(1), pages 103-117, February.
    7. Sarang Deo & Seyed Iravani & Tingting Jiang & Karen Smilowitz & Stephen Samuelson, 2013. "Improving Health Outcomes Through Better Capacity Allocation in a Community-Based Chronic Care Model," Operations Research, INFORMS, vol. 61(6), pages 1277-1294, December.
    8. Ilya O. Ryzhov & Warren B. Powell & Peter I. Frazier, 2012. "The Knowledge Gradient Algorithm for a General Class of Online Learning Problems," Operations Research, INFORMS, vol. 60(1), pages 180-195, February.
    9. Elliot Lee & Mariel S. Lavieri & Michael Volk, 2019. "Optimal Screening for Hepatocellular Carcinoma: A Restless Bandit Model," Manufacturing & Service Operations Management, INFORMS, vol. 21(1), pages 198-212, January.
    10. Daniel Adelman, 2003. "Price-Directed Replenishment of Subsets: Methodology and Its Application to Inventory Routing," Manufacturing & Service Operations Management, INFORMS, vol. 5(4), pages 348-371, May.
    11. Song Lin & Juanjuan Zhang & John R. Hauser, 2015. "Learning from Experience, Simply," Marketing Science, INFORMS, vol. 34(1), pages 1-19, January.
    12. Dimitris Bertsimas & Velibor V. Mišić, 2016. "Decomposable Markov Decision Processes: A Fluid Optimization Approach," Operations Research, INFORMS, vol. 64(6), pages 1537-1555, December.
    13. Abderrahmane Abbou & Viliam Makis, 2019. "Group Maintenance: A Restless Bandits Approach," INFORMS Journal on Computing, INFORMS, vol. 31(4), pages 719-731, October.
    14. Vivek F. Farias & Ritesh Madan, 2011. "The Irrevocable Multiarmed Bandit Problem," Operations Research, INFORMS, vol. 59(2), pages 383-399, April.
    15. Glazebrook, K. D. & Mitchell, H. M. & Ansell, P. S., 2005. "Index policies for the maintenance of a collection of machines by a set of repairmen," European Journal of Operational Research, Elsevier, vol. 165(1), pages 267-284, August.
    16. Christoph H. Loch & Stylianos Kavadias, 2002. "Dynamic Portfolio Selection of NPD Programs Using Marginal Returns," Management Science, INFORMS, vol. 48(10), pages 1227-1241, October.
    17. Silviya Valeva & Guodong Pang & Andrew J. Schaefer & Gilles Clermont, 2023. "Acuity-Based Allocation of ICU-Downstream Beds with Flexible Staffing," INFORMS Journal on Computing, INFORMS, vol. 35(2), pages 403-422, March.
    18. Turgay Ayer & Can Zhang & Anthony Bonifonte & Anne C. Spaulding & Jagpreet Chhatwal, 2019. "Prioritizing Hepatitis C Treatment in U.S. Prisons," Operations Research, INFORMS, vol. 67(3), pages 853-873, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. José Niño-Mora, 2006. "Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues," Mathematics of Operations Research, INFORMS, vol. 31(1), pages 50-84, February.
    2. José Niño-Mora, 2020. "A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits," Mathematics of Operations Research, INFORMS, vol. 45(2), pages 465-496, May.
    3. Dimitris Bertsimas & Velibor V. Mišić, 2016. "Decomposable Markov Decision Processes: A Fluid Optimization Approach," Operations Research, INFORMS, vol. 64(6), pages 1537-1555, December.
    4. Shaler Stidham, 2002. "Analysis, Design, and Control of Queueing Systems," Operations Research, INFORMS, vol. 50(1), pages 197-216, February.
    5. Hellerstein, Lisa & Lidbetter, Thomas, 2023. "A game theoretic approach to a problem in polymatroid maximization," European Journal of Operational Research, Elsevier, vol. 305(2), pages 979-988.
    6. José Niño-Mora, 2000. "On certain greedoid polyhedra, partially indexable scheduling problems and extended restless bandit allocation indices," Economics Working Papers 456, Department of Economics and Business, Universitat Pompeu Fabra.
    7. Santiago R. Balseiro & Ozan Candogan, 2017. "Optimal Contracts for Intermediaries in Online Advertising," Operations Research, INFORMS, vol. 65(4), pages 878-896, August.
    8. Dimitris Bertsimas & José Niño-Mora, 1996. "Optimization of multiclass queueing networks with changeover times via the achievable region approach: Part I, the single-station case," Economics Working Papers 302, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 1998.
    9. José Niño-Mora, 2022. "Multi-Gear Bandits, Partial Conservation Laws, and Indexability," Mathematics, MDPI, vol. 10(14), pages 1-31, July.
    10. Tianhu Deng & Ying‐Ju Chen & Zuo‐Jun Max Shen, 2015. "Optimal pricing and scheduling control of product shipping," Naval Research Logistics (NRL), John Wiley & Sons, vol. 62(3), pages 215-227, April.
    11. Kevin D. Glazebrook & José Niño-Mora, 2001. "Parallel Scheduling of Multiclass M/M/m Queues: Approximate and Heavy-Traffic Optimization of Achievable Performance," Operations Research, INFORMS, vol. 49(4), pages 609-623, August.
    12. Dimitris Bertsimas & José Niño-Mora, 1999. "Optimization of Multiclass Queueing Networks with Changeover Times Via the Achievable Region Approach: Part II, The Multi-Station Case," Mathematics of Operations Research, INFORMS, vol. 24(2), pages 331-361, May.
    13. Bertsimas, Dimitris. & Niño-Mora, Jose., 1994. "Restless bandit, linear programming relaxations and a primal-dual heuristic," Working papers 3727-94., Massachusetts Institute of Technology (MIT), Sloan School of Management.
    14. Dimitris Bertsimas & José Niño-Mora, 1996. "Optimization of multiclass queueing networks with changeover times via the achievable region method: Part II, the multi-station case," Economics Working Papers 314, Department of Economics and Business, Universitat Pompeu Fabra, revised Aug 1998.
    15. Esther Frostig & Gideon Weiss, 2016. "Four proofs of Gittins’ multiarmed bandit theorem," Annals of Operations Research, Springer, vol. 241(1), pages 127-165, June.
    16. Dimitris Bertsimas & José Niño-Mora, 1994. "Restless bandits, linear programming relaxations and a primal-dual index heuristic," Economics Working Papers 301, Department of Economics and Business, Universitat Pompeu Fabra, revised Oct 1997.
    17. José Niño-Mora, 2020. "Fast Two-Stage Computation of an Index Policy for Multi-Armed Bandits with Setup Delays," Mathematics, MDPI, vol. 9(1), pages 1-36, December.
    18. Bertsimas, Dimitris., 1995. "The achievable region method in the optimal control of queueing systems : formulations, bounds and policies," Working papers 3837-95., Massachusetts Institute of Technology (MIT), Sloan School of Management.
    19. Baris Ata & Yichuan Ding & Stefanos Zenios, 2021. "An Achievable-Region-Based Approach for Kidney Allocation Policy Design with Endogenous Patient Choice," Manufacturing & Service Operations Management, INFORMS, vol. 23(1), pages 36-54, January-February.
    20. R. Garbe & K. D. Glazebrook, 1998. "Submodular Returns and Greedy Heuristics for Queueing Scheduling Problems," Operations Research, INFORMS, vol. 46(3), pages 336-346, June.
