Undiscounted bandit games

Author

  • Keller, Godfrey
  • Rady, Sven

Abstract

We analyze undiscounted continuous-time games of strategic experimentation with two-armed bandits. The risky arm generates payoffs according to a Lévy process with an unknown average payoff per unit of time which nature draws from an arbitrary finite set. Observing all actions and realized payoffs, plus a free background signal, players use Markov strategies with the common posterior belief about the unknown parameter as the state variable. We show that the unique symmetric Markov perfect equilibrium can be computed in a simple closed form involving only the payoff of the safe arm, the expected current payoff of the risky arm, and the expected full-information payoff, given the current belief. In particular, the equilibrium does not depend on the precise specification of the payoff-generating processes.
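
As a rough illustration of the three quantities the abstract names, the sketch below computes them for a finite set of possible average payoffs. It assumes that the expected current payoff of the risky arm is the belief-weighted mean of the possible average payoffs, and that the expected full-information payoff is the belief-weighted mean of the larger of the safe payoff and each possible average payoff. The function names, the variable names, and the numbers are hypothetical. The equilibrium formula itself is not reproduced here, since the abstract does not state it.

```python
from typing import Sequence


def expected_risky_payoff(belief: Sequence[float], means: Sequence[float]) -> float:
    """Belief-weighted average payoff of the risky arm (assumed definition)."""
    return sum(p * mu for p, mu in zip(belief, means))


def expected_full_info_payoff(
    belief: Sequence[float], means: Sequence[float], safe: float
) -> float:
    """Expected payoff if the true parameter were revealed at once.

    Assumed definition: a fully informed player picks the better of the
    safe payoff and the true average risky payoff in every state.
    """
    return sum(p * max(safe, mu) for p, mu in zip(belief, means))


# Hypothetical example: three possible average payoffs and a common posterior.
safe = 1.0
means = [0.0, 1.5, 3.0]
belief = [0.5, 0.25, 0.25]

m = expected_risky_payoff(belief, means)
f = expected_full_info_payoff(belief, means, safe)
print(f"safe={safe}, risky={m}, full_info={f}")
```

Under these assumed definitions the full-information payoff is never below either the safe payoff or the expected current risky payoff, so the gap between it and the better of the two is one way to read the value of learning at a given belief.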

Suggested Citation

  • Keller, Godfrey & Rady, Sven, 2020. "Undiscounted bandit games," Games and Economic Behavior, Elsevier, vol. 124(C), pages 43-61.
  • Handle: RePEc:eee:gamebe:v:124:y:2020:i:c:p:43-61
    DOI: 10.1016/j.geb.2020.08.003

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0899825620301111
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Asaf Cohen & Eilon Solan, 2013. "Bandit Problems with Lévy Processes," Mathematics of Operations Research, INFORMS, vol. 38(1), pages 92-107, February.
    2. Godfrey Keller & Sven Rady, 1999. "Optimal Experimentation in a Changing Environment," Review of Economic Studies, Oxford University Press, vol. 66(3), pages 475-507.
    3. Bergemann, Dirk & Valimaki, Juuso, 2002. "Entry and Vertical Differentiation," Journal of Economic Theory, Elsevier, vol. 106(1), pages 91-125, September.
    4. Godfrey Keller & Sven Rady & Martin Cripps, 2005. "Strategic Experimentation with Exponential Bandits," Econometrica, Econometric Society, vol. 73(1), pages 39-68, January.
    5. Patrick Bolton & Christopher Harris, 1999. "Strategic Experimentation," Econometrica, Econometric Society, vol. 67(2), pages 349-374, March.
    6. Dirk Bergemann & Juuso Valimaki, 1997. "Market Diffusion with Two-Sided Learning," RAND Journal of Economics, The RAND Corporation, vol. 28(4), pages 773-795, Winter.
    7. Ke, T. Tony & Villas-Boas, J. Miguel, 2019. "Optimal learning before choice," Journal of Economic Theory, Elsevier, vol. 180(C), pages 383-437.
    8. Keller, Godfrey & Rady, Sven, 2010. "Strategic experimentation with Poisson bandits," Theoretical Economics, Econometric Society, vol. 5(2), May.
    9. Martin Peitz & Sven Rady & Piers Trepper, 2017. "Experimentation in Two-Sided Markets," Journal of the European Economic Association, European Economic Association, vol. 15(1), pages 128-172.
    10. Christopher Harris, 1993. "Generalized Solutions of Stochastic Differential Games in One Dimension," Papers 0044, Boston University - Industry Studies Programme.
    11. Pietro Veronesi, 2000. "How Does Information Quality Affect Stock Returns?," Journal of Finance, American Finance Association, vol. 55(2), pages 807-837, April.
    12. Alessandro Bonatti, 2011. "Menu Pricing and Learning," American Economic Journal: Microeconomics, American Economic Association, vol. 3(3), pages 124-163, August.
    13. Dutta, Prajit K., 1991. "What do discounted optima converge to?: A theory of discount rate asymptotics in economic models," Journal of Economic Theory, Elsevier, vol. 55(1), pages 64-94, October.
    14. Moscarini, Giuseppe & Squintani, Francesco, 2010. "Competitive experimentation with private information: The survivor's curse," Journal of Economic Theory, Elsevier, vol. 145(2), pages 639-660, March.
    15. Jovanovic, Boyan, 1979. "Job Matching and the Theory of Turnover," Journal of Political Economy, University of Chicago Press, vol. 87(5), pages 972-990, October.
    16. Keller, Godfrey & Rady, Sven, 2003. "Price Dispersion and Learning in a Dynamic Differentiated-Goods Duopoly," RAND Journal of Economics, The RAND Corporation, vol. 34(1), pages 138-165, Spring.
    17. Dutta, P.K., 1991. "What Do Discounted Optima Converge To? A Theory of Discount Rate Asymptotics in Economic Models," RCER Working Papers 264, University of Rochester - Center for Economic Research (RCER).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Weng, Xi, 2015. "Dynamic pricing in the presence of individual learning," Journal of Economic Theory, Elsevier, vol. 155(C), pages 262-299.
    2. Alessandro Bonatti, 2008. "Continuous-Time Screening Contracts," 2008 Meeting Papers 493, Society for Economic Dynamics.
    3. Keller, Godfrey & Rady, Sven, 2015. "Breakdowns," Theoretical Economics, Econometric Society, vol. 10(1), January.
    4. Martin Peitz & Sven Rady & Piers Trepper, 2017. "Experimentation in Two-Sided Markets," Journal of the European Economic Association, European Economic Association, vol. 15(1), pages 128-172.
    5. Décamps, Jean-Paul & Mariotti, Thomas & Villeneuve, Stéphane, 2000. "Investment Timing under Incomplete Information," IDEI Working Papers 115, Institut d'Économie Industrielle (IDEI), Toulouse, revised Apr 2004.
    6. Rosenberg, Dinah & Salomon, Antoine & Vieille, Nicolas, 2013. "On games of strategic experimentation," Games and Economic Behavior, Elsevier, vol. 82(C), pages 31-51.
    7. Dinah Rosenberg & Eilon Solan & Nicolas Vieille, 2007. "Social Learning in One-Arm Bandit Problems," Econometrica, Econometric Society, vol. 75(6), pages 1591-1611, November.
    8. Nicolas KLEIN & Peter WAGNER, 2018. "Strategic Investment and Learning with Private Information," Cahiers de recherche 13-2018, Centre interuniversitaire de recherche en économie quantitative, CIREQ.
    9. Eeckhout, Jan & Weng, Xi, 2015. "Common value experimentation," Journal of Economic Theory, Elsevier, vol. 160(C), pages 317-339.
    10. Godfrey Keller & Sven Rady, 1998. "Market Experimentation in a Dynamic Differentiated-Goods Duopoly," Game Theory and Information 9810001, University Library of Munich, Germany, revised 20 Aug 1999.
    11. Axel Anderson & Luís M. B. Cabral, 2007. "Go for broke or play it safe? Dynamic competition with choice of variance," RAND Journal of Economics, RAND Corporation, vol. 38(3), pages 593-609, September.
    12. Strulovici, Bruno & Szydlowski, Martin, 2015. "On the smoothness of value functions and the existence of optimal strategies in diffusion models," Journal of Economic Theory, Elsevier, vol. 159(PB), pages 1016-1055.
    13. Roland Fryer & Philipp Harms, 2018. "Two-Armed Restless Bandits with Imperfect Information: Stochastic Control and Indexability," Mathematics of Operations Research, INFORMS, vol. 43(2), pages 399-427, May.
    14. Dinah Rosenberg & Eilon Solan & Nicolas Vieille, 2004. "Timing Games with Informational Externalities," Levine's Working Paper Archive 122247000000000704, David K. Levine.
    15. Bonatti, Alessandro & Hörner, Johannes, 2017. "Learning to disagree in a game of experimentation," Journal of Economic Theory, Elsevier, vol. 169(C), pages 234-269.
    16. Keller, Godfrey & Novák, Vladimír & Willems, Tim, 2019. "A note on optimal experimentation under risk aversion," Journal of Economic Theory, Elsevier, vol. 179(C), pages 476-487.
    17. Kaustav Das, 2014. "Strategic Experimentation with Competition and Private Arrival of Information," Discussion Papers 1404, University of Exeter, Department of Economics.
    18. Francis Bloch & Simona Fabrizi & Steffen Lippert, 2015. "Learning and collusion in new markets with uncertain entry costs," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 58(2), pages 273-303, February.
    19. Jean-Paul Décamps & Thomas Mariotti & Stéphane Villeneuve, 2005. "Investment Timing Under Incomplete Information," Mathematics of Operations Research, INFORMS, vol. 30(2), pages 472-500, May.
    20. Heidhues, Paul & Rady, Sven & Strack, Philipp, 2015. "Strategic experimentation with private payoffs," Journal of Economic Theory, Elsevier, vol. 159(PA), pages 531-551.

    More about this item

    Keywords

    Strategic experimentation; Bayesian two-armed bandit; Strong long-run average criterion; Markov perfect equilibrium; HJB equation; Viscosity solution;

    JEL classification:

    • C73 - Mathematical and Quantitative Methods - - Game Theory and Bargaining Theory - - - Stochastic and Dynamic Games; Evolutionary Games
    • D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness
