
Two-Armed Restless Bandits with Imperfect Information: Stochastic Control and Indexability

Author

Listed:
  • Roland Fryer

    (Harvard University, Cambridge, Massachusetts 02138 and National Bureau of Economic Research (NBER), Cambridge, Massachusetts)

  • Philipp Harms

    (Freiburg University, 79085 Freiburg, Germany)

Abstract

We present a two-armed bandit model of decision making under uncertainty where the expected return to investing in the “risky arm” increases when choosing that arm and decreases when choosing the “safe” arm. These dynamics are natural in applications such as human capital development, job search, and occupational choice. Using new insights from stochastic control, along with a monotonicity condition on the payoff dynamics, we show that optimal strategies in our model are stopping rules that can be characterized by an index which formally coincides with Gittins’ index. Our result implies the indexability of a new class of restless bandit models.

Suggested Citation

  • Roland Fryer & Philipp Harms, 2018. "Two-Armed Restless Bandits with Imperfect Information: Stochastic Control and Indexability," Mathematics of Operations Research, INFORMS, vol. 43(2), pages 399-427, May.
  • Handle: RePEc:inm:ormoor:v:43:y:2018:i:2:p:399-427
    DOI: 10.1287/moor.2017.0863

    Download full text from publisher

    File URL: https://doi.org/10.1287/moor.2017.0863
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Roland G. Fryer, Jr. & Philipp Harms, 2013. "Two-Armed Restless Bandits with Imperfect Information: Stochastic Control and Indexability," NBER Working Papers 19043, National Bureau of Economic Research, Inc.
    2. Forand, Jean Guillaume, 2015. "Keeping your options open," Journal of Economic Dynamics and Control, Elsevier, vol. 53(C), pages 47-68.
    3. Keller, Godfrey & Novák, Vladimír & Willems, Tim, 2019. "A note on optimal experimentation under risk aversion," Journal of Economic Theory, Elsevier, vol. 179(C), pages 476-487.
    4. Sorensen, Morten, 2007. "Learning by Investing: Evidence from Venture Capital," SIFR Research Report Series 53, Institute for Financial Research.
    5. Asaf Cohen & Eilon Solan, 2013. "Bandit Problems with Lévy Processes," Mathematics of Operations Research, INFORMS, vol. 38(1), pages 92-107, February.
    6. Keller, Godfrey & Rady, Sven, 2020. "Undiscounted bandit games," Games and Economic Behavior, Elsevier, vol. 124(C), pages 43-61.
    7. Deimen, Inga & Wirtz, Julia, 2016. "A Bandit Model of Two-Dimensional Uncertainty -- Rationalizing Mindsets," VfS Annual Conference 2016 (Augsburg): Demographic Change 145931, Verein für Socialpolitik / German Economic Association.
    8. Camargo, Braz, 2014. "Learning in society," Games and Economic Behavior, Elsevier, vol. 87(C), pages 381-396.
    9. Nicolas Klein & Sven Rady, 2011. "Negatively Correlated Bandits," Review of Economic Studies, Oxford University Press, vol. 78(2), pages 693-732.
    10. Keller, Godfrey & Oldale, Alison, 2003. "Branching bandits: a sequential search process with correlated pay-offs," Journal of Economic Theory, Elsevier, vol. 113(2), pages 302-315, December.
    11. Heidhues, Paul & Rady, Sven & Strack, Philipp, 2015. "Strategic experimentation with private payoffs," Journal of Economic Theory, Elsevier, vol. 159(PA), pages 531-551.
    12. Klein, Nicolas, 2013. "Strategic learning in teams," Games and Economic Behavior, Elsevier, vol. 82(C), pages 636-657.
    13. Agbo, Maxime, 2015. "A perpetual search for talents across overlapping generations: A learning process," Mathematical Social Sciences, Elsevier, vol. 76(C), pages 131-145.
    14. Thijssen, Jacco J.J. & Bregantini, Daniele, 2017. "Costly sequential experimentation and project valuation with an application to health technology assessment," Journal of Economic Dynamics and Control, Elsevier, vol. 77(C), pages 202-229.
    15. Johannes Hörner & Larry Samuelson, 2013. "Incentives for experimenting agents," RAND Journal of Economics, RAND Corporation, vol. 44(4), pages 632-663, December.
    16. Keller, Godfrey & Rady, Sven, 2010. "Strategic experimentation with Poisson bandits," Theoretical Economics, Econometric Society, vol. 5(2), May.
    17. Doruk Cetemen & Can Urgun & Leeat Yariv, 2021. "Collective Progress: Dynamics of Exit Waves," NBER Working Papers 29008, National Bureau of Economic Research, Inc.
    18. Kohei Kawaguchi, 2021. "When Will Workers Follow an Algorithm? A Field Experiment with a Retail Business," Management Science, INFORMS, vol. 67(3), pages 1670-1695, March.
    19. Maloney, William F. & Zambrano, Andrés, 2021. "Learning to Learn: Experimentation, Entrepreneurial Capital, and Development," Policy Research Working Paper Series 9890, The World Bank.
    20. Bergemann, Dirk & Valimaki, Juuso, 2001. "Stationary multi-choice bandit problems," Journal of Economic Dynamics and Control, Elsevier, vol. 25(10), pages 1585-1594, October.
