Printed from https://ideas.repec.org/p/arx/papers/2404.19116.html

Disentangling Exploration from Exploitation

Author

Listed:
  • Alessandro Lizzeri
  • Eran Shmaya
  • Leeat Yariv

Abstract

Starting from Robbins (1952), the literature on experimentation via multi-armed bandits has wed exploration and exploitation. Nonetheless, in many applications, agents' exploration and exploitation need not be intertwined: a policymaker may assess new policies different from the status quo; an investor may evaluate projects outside her portfolio. We characterize the optimal experimentation policy when exploration and exploitation are disentangled in the case of Poisson bandits, allowing for general news structures. The optimal policy features complete learning asymptotically and exhibits substantial persistence, but cannot be identified by an index à la Gittins. Disentanglement is particularly valuable for intermediate parameter values.

Suggested Citation

  • Alessandro Lizzeri & Eran Shmaya & Leeat Yariv, 2024. "Disentangling Exploration from Exploitation," Papers 2404.19116, arXiv.org.
  • Handle: RePEc:arx:papers:2404.19116

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2404.19116
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Miller, Robert A, 1984. "Job Matching and Occupational Choice," Journal of Political Economy, University of Chicago Press, vol. 92(6), pages 1086-1120, December.
    2. Janet M. Currie & W. Bentley MacLeod, 2020. "Understanding Doctor Decision Making: The Case of Depression Treatment," Econometrica, Econometric Society, vol. 88(3), pages 847-878, May.
    3. Janet M. Currie & W. Bentley MacLeod, 2018. "Understanding Doctor Decision Making: The Case of Depression," NBER Working Papers 24955, National Bureau of Economic Research, Inc.
    4. Bergemann, Dirk & Hege, Ulrich, 1998. "Venture capital financing, moral hazard, and learning," Journal of Banking & Finance, Elsevier, vol. 22(6-8), pages 703-735, August.
    5. Annie Liang & Xiaosheng Mu & Vasilis Syrgkanis, 2022. "Dynamically Aggregating Diverse Information," Econometrica, Econometric Society, vol. 90(1), pages 47-80, January.
    6. Yeon-Koo Che & Konrad Mierendorff, 2019. "Optimal Dynamic Allocation of Attention," American Economic Review, American Economic Association, vol. 109(8), pages 2993-3029, August.
    7. Johannes Hörner & Larry Samuelson, 2013. "Incentives for experimenting agents," RAND Journal of Economics, RAND Corporation, vol. 44(4), pages 632-663, December.
    8. Sims, Christopher A., 2003. "Implications of rational inattention," Journal of Monetary Economics, Elsevier, vol. 50(3), pages 665-690, April.
    9. Ettore Damiano & Hao Li & Wing Suen, 2020. "Learning While Experimenting," The Economic Journal, Royal Economic Society, vol. 130(625), pages 65-92.
    10. Jovanovic, Boyan, 1979. "Job Matching and the Theory of Turnover," Journal of Political Economy, University of Chicago Press, vol. 87(5), pages 972-990, October.
    11. Yeon-Koo Che & Johannes Hörner, 2018. "Recommender Systems as Mechanisms for Social Learning," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(2), pages 871-925.
    12. Gilles Stoltz & Sébastien Bubeck & Rémi Munos, 2011. "Pure exploration in finitely-armed and continuous-armed bandits," Post-Print hal-00609550, HAL.
    13. repec:cwl:cwldpp:1726rrr is not listed on IDEAS
    14. Godfrey Keller & Sven Rady & Martin Cripps, 2005. "Strategic Experimentation with Exponential Bandits," Econometrica, Econometric Society, vol. 73(1), pages 39-68, January.
    15. Patrick Bolton & Christopher Harris, 1999. "Strategic Experimentation," Econometrica, Econometric Society, vol. 67(2), pages 349-374, March.
    16. repec:cwl:cwldpp:1726rr is not listed on IDEAS
    17. Bartosz Maćkowiak & Filip Matějka & Mirko Wiederholt, 2023. "Rational Inattention: A Review," Journal of Economic Literature, American Economic Association, vol. 61(1), pages 226-273, March.
    18. Bruno Strulovici, 2010. "Learning While Voting: Determinants of Collective Experimentation," Econometrica, Econometric Society, vol. 78(3), pages 933-971, May.
    19. Annie Liang & Xiaosheng Mu, 2020. "Complementary Information and Learning Traps," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 135(1), pages 389-448.
    20. Rothschild, Michael, 1974. "A two-armed bandit theory of market pricing," Journal of Economic Theory, Elsevier, vol. 9(2), pages 185-202, October.
    21. Yingni Guo, 2016. "Dynamic Delegation of Experimentation," American Economic Review, American Economic Association, vol. 106(8), pages 1969-2008, August.
    22. Keller, Godfrey & Rady, Sven, 2010. "Strategic experimentation with Poisson bandits," Theoretical Economics, Econometric Society, vol. 5(2), May.

    Citations


    Cited by:

    1. Mariagiovanna Baccara & Gilat Levy & Ronny Razin, 2025. "Research Waves," CESifo Working Paper Series 12248, CESifo.
    2. Polina Borisova & Nikhil Vellodi, 2024. "A Theory of Self-Prospection," PSE Working Papers halshs-04721098, HAL.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Heidhues, Paul & Rady, Sven & Strack, Philipp, 2015. "Strategic experimentation with private payoffs," Journal of Economic Theory, Elsevier, vol. 159(PA), pages 531-551.
    2. Chen, Chia-Hui & Ishida, Junichiro, 2018. "Hierarchical experimentation," Journal of Economic Theory, Elsevier, vol. 177(C), pages 365-404.
    3. Sorensen, Morten, 2007. "Learning by Investing: Evidence from Venture Capital," SIFR Research Report Series 53, Institute for Financial Research.
    4. Mira Frick & Yuhta Ishii, 2015. "Innovation Adoption by Forward-Looking Social Learners," Cowles Foundation Discussion Papers 1877, Cowles Foundation for Research in Economics, Yale University.
    5. Forand, Jean Guillaume, 2015. "Keeping your options open," Journal of Economic Dynamics and Control, Elsevier, vol. 53(C), pages 47-68.
    6. Thomas, Caroline, 2019. "Experimentation with reputation concerns – Dynamic signalling with changing types," Journal of Economic Theory, Elsevier, vol. 179(C), pages 366-415.
    7. Yingkai Li & Jonathan Libgober, 2023. "Incentivizing Forecasters to Learn: Summarized vs. Unrestricted Advice," Papers 2310.19147, arXiv.org, revised Dec 2025.
    8. Jan Eeckhout & Xi Weng, 2022. "Assortative Learning," Economica, London School of Economics and Political Science, vol. 89(355), pages 647-688, July.
    9. Chen, Chia-Hui & Ishida, Junichiro & Mukherjee, Arijit, 2023. "Pioneer, early follower or late entrant: Entry dynamics with learning and market competition," European Economic Review, Elsevier, vol. 152(C).
    10. Aubrey Clark & Giovanni Reggiani, 2021. "Contracts for acquiring information," Papers 2103.03911, arXiv.org.
    11. Thomas Greve & Hans Keiding, 2023. "A model of privately funded public research," Journal of Economics, Springer, vol. 140(1), pages 63-91, September.
    12. Weng, Xi, 2015. "Dynamic pricing in the presence of individual learning," Journal of Economic Theory, Elsevier, vol. 155(C), pages 262-299.
    13. Nicolas Klein & Tymofiy Mylovanov, 2011. "Should the Flatterers be Avoided?," 2011 Meeting Papers 1273, Society for Economic Dynamics.
    14. Besanko, David & Tong, Jian & Wu, Jianjun, 2016. "Subsidizing research programs with "if" and "when" uncertainty in the face of severe informational constraints," Discussion Paper Series In Economics And Econometrics 1605, Economics Division, School of Social Sciences, University of Southampton.
    15. Bergemann, Dirk & Valimaki, Juuso, 1996. "Learning and Strategic Pricing," Econometrica, Econometric Society, vol. 64(5), pages 1125-1149, September.
    16. Xie, Yinxi & Xie, Yang, 2017. "Machiavellian experimentation," Journal of Comparative Economics, Elsevier, vol. 45(4), pages 685-711.
    17. Kaustav Das & Nicolas Klein & Katharina Schmid, 2020. "Strategic experimentation with asymmetric players," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 69(4), pages 1147-1175, June.
    18. Caroline D. Thomas, 2021. "Strategic Experimentation with Congestion," American Economic Journal: Microeconomics, American Economic Association, vol. 13(1), pages 1-82, February.
    19. Klein, Nicolas, 2013. "Strategic learning in teams," Games and Economic Behavior, Elsevier, vol. 82(C), pages 636-657.
    20. Farzad Pourbabaee, 2024. "Reputation, learning and project choice in frictional economies," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 78(4), pages 1075-1115, December.

    More about this item

    JEL classification:

    • C73 - Mathematical and Quantitative Methods - - Game Theory and Bargaining Theory - - - Stochastic and Dynamic Games; Evolutionary Games
    • D81 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Criteria for Decision-Making under Risk and Uncertainty
    • D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness
    • O35 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Social Innovation

