
Bandit Problems

Author

Listed:

  • Dirk Bergemann
  • Juuso Valimaki

Abstract

We survey the literature on multi-armed bandit models and their applications in economics. The multi-armed bandit problem is a statistical decision model of an agent trying to optimize his decisions while improving his information at the same time. This classic problem has received much attention in economics as it concisely models the trade-off between exploration (trying out each arm to find the best one) and exploitation (playing the arm believed to give the best payoff).
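The exploration/exploitation trade-off described in the abstract can be illustrated with a minimal simulation. The sketch below is not taken from the paper: it uses epsilon-greedy, a simple heuristic chosen only for illustration, rather than the Bayesian index policies named in the keywords. The function name `run_epsilon_greedy`, the Bernoulli payoffs, and all parameter values are assumptions made for this example.

```python
import random


def run_epsilon_greedy(true_means, steps, epsilon, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli bandit.

    true_means: hypothetical success probability of each arm (unknown to the agent).
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of times each arm was played
    estimates = [0.0] * n_arms   # running average payoff of each arm
    total_reward = 0

    for t in range(steps):
        if t < n_arms:
            arm = t  # play every arm once to initialise its estimate
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick an arm at random
        else:
            # exploit: play the arm currently believed to pay best
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1 if rng.random() < true_means[arm] else 0
        counts[arm] += 1
        # incremental update of the sample mean for the chosen arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, total_reward


if __name__ == "__main__":
    # Illustrative (assumed) arm probabilities; the agent never sees them directly.
    counts, total = run_epsilon_greedy([0.2, 0.5, 0.8], steps=1000, epsilon=0.1)
    print("pulls per arm:", counts)
    print("total reward:", total)
```

Raising `epsilon` makes the agent gather more information about every arm at the cost of playing apparently inferior arms more often. Setting it to zero makes the agent commit to whichever arm looked best after the initial round.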

Suggested Citation

  • Dirk Bergemann & Juuso Valimaki, 2006. "Bandit Problems," Cowles Foundation Discussion Papers 1551, Cowles Foundation for Research in Economics, Yale University.
  • Handle: RePEc:cwl:cwldpp:1551
    Note: CFP 1292

    Download full text from publisher

    File URL: http://cowles.yale.edu/sites/default/files/files/pub/d15/d1551.pdf
    Download Restriction: no

    Citations


    Cited by:

    1. Ufuk Akcigit & Qingmin Liu, 2011. "The Role of Information in Competitive Experimentation," Levine's Working Paper Archive 786969000000000321, David K. Levine.
    2. Nicolas Klein & Sven Rady, 2011. "Negatively Correlated Bandits," Review of Economic Studies, Oxford University Press, vol. 78(2), pages 693-732.
    3. Patrick Warren & Tom Wilkening, 2010. "Regulatory Fog: The Informational Origins of Regulatory Persistence," Department of Economics - Working Papers Series 1113, The University of Melbourne.
    4. Sorensen, Morten, 2007. "Learning by Investing: Evidence from Venture Capital," SIFR Research Report Series 53, Institute for Financial Research.
    5. Lori Beaman & Raghabendra Chattopadhyay & Esther Duflo & Rohini Pande & Petia Topalova, 2009. "Powerful Women: Does Exposure Reduce Bias?," The Quarterly Journal of Economics, Oxford University Press, vol. 124(4), pages 1497-1540.
    6. Keller, Godfrey & Rady, Sven, 2010. "Strategic experimentation with Poisson bandits," Theoretical Economics, Econometric Society, vol. 5(2), May.
    7. Eitan Altman, 2007. "Comments on: Dynamic priority allocation via restless bandit marginal productivity indices," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 15(2), pages 202-207, December.
    8. Ramana Nanda & Matthew Rhodes-Kropf, 2012. "Innovation Policies," Harvard Business School Working Papers 13-038, Harvard Business School, revised Mar 2017.
    9. Warren, Patrick L. & Wilkening, Tom S., 2012. "Regulatory fog: The role of information in regulatory persistence," Journal of Economic Behavior & Organization, Elsevier, vol. 84(3), pages 840-856.
    10. Berndt, Ernst R. & Gibbons, Robert S. & Kolotilin, Anton & Taub, Anna Levine, 2015. "The heterogeneity of concentrated prescribing behavior: Theory and evidence from antipsychotics," Journal of Health Economics, Elsevier, vol. 40(C), pages 26-39.
    11. Rosenberg, Dinah & Salomon, Antoine & Vieille, Nicolas, 2013. "On games of strategic experimentation," Games and Economic Behavior, Elsevier, vol. 82(C), pages 31-51.
    12. Cripps, Martin W., 2013. "Optimal learning of a set: Or how to edit a journal if you must," Economics Letters, Elsevier, vol. 120(3), pages 384-388.
    13. Piermont, Evan & Takeoka, Norio & Teper, Roee, 2016. "Learning the Krepsian state: Exploration through consumption," Games and Economic Behavior, Elsevier, vol. 100(C), pages 69-94.
    14. Deb, Rahul, 2008. "Optimal Contracting Of New Experience Goods," MPRA Paper 9880, University Library of Munich, Germany.
    15. Springborn, Michael R., 2014. "Risk aversion and adaptive management: Insights from a multi-armed bandit model of invasive species risk," Journal of Environmental Economics and Management, Elsevier, vol. 68(2), pages 226-242.
    16. Roee Teper, 2016. "Learning the Krepsian State: Exploration Through Consumption," Working Paper 5860, Department of Economics, University of Pittsburgh.

    More about this item

    Keywords

    One-Armed Bandit; Multi-Armed Bandit; Bayesian Learning; Experimentation; Index Policy; Matching; Experience Goods

    JEL classification:

    • C72 - Mathematical and Quantitative Methods - - Game Theory and Bargaining Theory - - - Noncooperative Games
    • C73 - Mathematical and Quantitative Methods - - Game Theory and Bargaining Theory - - - Stochastic and Dynamic Games; Evolutionary Games
    • D43 - Microeconomics - - Market Structure, Pricing, and Design - - - Oligopoly and Other Forms of Market Imperfection
    • D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness
