Printed from https://ideas.repec.org/p/cwl/cwldpp/1551.html

Bandit Problems

Author

Listed:

  • Dirk Bergemann
  • Juuso Valimaki

Abstract

We survey the literature on multi-armed bandit models and their applications in economics. The multi-armed bandit problem is a statistical decision model of an agent trying to optimize his decisions while improving his information at the same time. This classic problem has received much attention in economics as it concisely models the trade-off between exploration (trying out each arm to find the best one) and exploitation (playing the arm believed to give the best payoff).
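
The exploration/exploitation trade-off described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it uses an epsilon-greedy rule, a simple heuristic chosen only for illustration, rather than the Bayesian index policies listed in the keywords. The function name `epsilon_greedy`, the Bernoulli payoff assumption, and all parameter values are hypothetical.

```python
import random


def epsilon_greedy(arm_probs, steps, epsilon, seed=0):
    """Simulate an epsilon-greedy agent on Bernoulli arms (illustrative only).

    arm_probs: assumed success probability of each arm (unknown to the agent).
    epsilon:   probability of exploring a uniformly random arm at each step.
    Returns (pull counts per arm, estimated mean payoff per arm, total reward).
    """
    rng = random.Random(seed)
    n = len(arm_probs)
    counts = [0] * n        # number of times each arm has been played
    estimates = [0.0] * n   # running sample mean of each arm's payoff
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try an arm at random to improve information.
            arm = rng.randrange(n)
        else:
            # Exploitation: play the arm currently believed to be best
            # (ties go to the lowest index).
            arm = max(range(n), key=lambda i: estimates[i])
        reward = 1 if rng.random() < arm_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the sample mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, estimates, total_reward
```

With `epsilon=0.0` the agent never explores. On the assumed arms `[0.0, 1.0]` it stays on the first arm forever and earns nothing, while any positive `epsilon` lets it eventually discover the better arm. This shows in miniature why an agent may give up some current payoff to gain information.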

Suggested Citation

  • Dirk Bergemann & Juuso Valimaki, 2006. "Bandit Problems," Cowles Foundation Discussion Papers 1551, Cowles Foundation for Research in Economics, Yale University.
  • Handle: RePEc:cwl:cwldpp:1551
    Note: CFP 1292

    Download full text from publisher

    File URL: https://cowles.yale.edu/sites/default/files/files/pub/d15/d1551.pdf
    Download Restriction: no

    Citations

    Cited by:

    1. Roee Teper, 2016. "Learning the Krepsian State: Exploration Through Consumption," Working Paper 5860, Department of Economics, University of Pittsburgh.
    2. Warren, Patrick L. & Wilkening, Tom S., 2012. "Regulatory fog: The role of information in regulatory persistence," Journal of Economic Behavior & Organization, Elsevier, vol. 84(3), pages 840-856.
    3. Lori Beaman & Raghabendra Chattopadhyay & Esther Duflo & Rohini Pande & Petia Topalova, 2009. "Powerful Women: Does Exposure Reduce Bias?," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 124(4), pages 1497-1540.
    4. Ufuk Akcigit & Qingmin Liu, 2011. "The Role of Information in Competitive Experimentation," Levine's Working Paper Archive 786969000000000321, David K. Levine.
    5. Keller, Godfrey & Rady, Sven, 2010. "Strategic experimentation with Poisson bandits," Theoretical Economics, Econometric Society, vol. 5(2), May.
    6. Berndt, Ernst R. & Gibbons, Robert S. & Kolotilin, Anton & Taub, Anna Levine, 2015. "The heterogeneity of concentrated prescribing behavior: Theory and evidence from antipsychotics," Journal of Health Economics, Elsevier, vol. 40(C), pages 26-39.
    7. Nicolas Klein & Sven Rady, 2011. "Negatively Correlated Bandits," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 78(2), pages 693-732.
    8. Rosenberg, Dinah & Salomon, Antoine & Vieille, Nicolas, 2013. "On games of strategic experimentation," Games and Economic Behavior, Elsevier, vol. 82(C), pages 31-51.
    9. Patrick Warren & Tom Wilkening, 2010. "Regulatory Fog: The Informational Origins of Regulatory Persistence," Department of Economics - Working Papers Series 1113, The University of Melbourne.
    10. Sorensen, Morten, 2007. "Learning by Investing: Evidence from Venture Capital," SIFR Research Report Series 53, Institute for Financial Research.
    11. Piermont, Evan & Takeoka, Norio & Teper, Roee, 2016. "Learning the Krepsian state: Exploration through consumption," Games and Economic Behavior, Elsevier, vol. 100(C), pages 69-94.
    12. Cripps, Martin W., 2013. "Optimal learning of a set: Or how to edit a journal if you must," Economics Letters, Elsevier, vol. 120(3), pages 384-388.
    13. Deb, Rahul, 2008. "Optimal Contracting Of New Experience Goods," MPRA Paper 9880, University Library of Munich, Germany.
    14. Eitan Altman, 2007. "Comments on: Dynamic priority allocation via restless bandit marginal productivity indices," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 15(2), pages 202-207, December.
    15. Springborn, Michael R., 2014. "Risk aversion and adaptive management: Insights from a multi-armed bandit model of invasive species risk," Journal of Environmental Economics and Management, Elsevier, vol. 68(2), pages 226-242.
    16. Ramana Nanda & Matthew Rhodes-Kropf, 2012. "Innovation Policies," Harvard Business School Working Papers 13-038, Harvard Business School, revised Mar 2017.

    More about this item

    Keywords

    One-Armed Bandit; Multi-Armed Bandit; Bayesian Learning; Experimentation; Index Policy; Matching; Experience Goods;

    JEL classification:

    • C72 - Mathematical and Quantitative Methods; Game Theory and Bargaining Theory; Noncooperative Games
    • C73 - Mathematical and Quantitative Methods; Game Theory and Bargaining Theory; Stochastic and Dynamic Games, Evolutionary Games
    • D43 - Microeconomics; Market Structure, Pricing, and Design; Oligopoly and Other Forms of Market Imperfection
    • D83 - Microeconomics; Information, Knowledge, and Uncertainty; Search, Learning, Information and Knowledge, Communication, Belief, Unawareness
