Printed from https://ideas.repec.org/a/eee/dyncon/v184y2026ics0165188926000102.html

Optimal allocation strategies in a discrete-time bandit problem

Author

Listed:
  • Hu, Audrey
  • Zou, Liang

Abstract

We study a discrete-time, two-armed “breakthrough” bandit in which an agent allocates a perfectly divisible resource each period between a safe arm and a risky arm. Departing from the binary “either–or” paradigm, we consider continuous allocation strategies and a general success technology F with nonincreasing hazard rate. Using a variational, pathwise approach combined with dynamic programming, we characterize the unique optimal belief–allocation path via a time-invariant backward/forward transformation. The optimal path features interior, tapering allocations that never stop prior to a breakthrough, and it delivers a strictly higher eventual success probability and expected payoff than the optimal binary (bang-bang) benchmark. In the exponential case, the mappings become explicit, making computation immediate and revealing a Goldilocks principle: the total planned allocation to exploration is maximized at intermediate task difficulty. The framework highlights comparative dynamics (how entire optimal paths shift with primitives) while remaining robust to the functional form of F.
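To make the belief dynamics behind the abstract concrete, the following is a minimal Python sketch of Bayesian updating in a breakthrough bandit under an exponential success technology. It is not the authors' method: the form F(x) = 1 - exp(-lam * x), the rate parameter `lam`, the function names, and the tapering allocation path are all assumptions chosen for illustration, and the path shown is arbitrary rather than the optimal one characterized in the paper. The sketch assumes that a good risky arm succeeds with probability F(x) when it receives allocation x, and that a bad arm never succeeds.

```python
import math


def success_prob(x, lam=1.0):
    """Assumed exponential success technology F(x) = 1 - exp(-lam * x)."""
    return 1.0 - math.exp(-lam * x)


def update_belief(p, x, lam=1.0):
    """Bayes update of the belief that the risky arm is good,
    after allocating x to it and observing no breakthrough."""
    f = success_prob(x, lam)
    return p * (1.0 - f) / (1.0 - p * f)


def belief_path(p0, allocations, lam=1.0):
    """Beliefs along a planned allocation path, conditional on no breakthrough."""
    beliefs = [p0]
    for x in allocations:
        beliefs.append(update_belief(beliefs[-1], x, lam))
    return beliefs


def eventual_success_prob(p0, allocations, lam=1.0):
    """Probability that a breakthrough occurs somewhere along the path.
    With exponential F, only the total allocation matters."""
    return p0 * (1.0 - math.exp(-lam * sum(allocations)))


# Hypothetical tapering path (illustrative only, not the optimal path).
allocations = [0.5, 0.25, 0.125]
beliefs = belief_path(0.6, allocations)
```

Under these assumptions the belief falls after each period without a breakthrough, and the success probability along the path depends on the allocations only through their sum, which is why total planned allocation is a natural summary statistic in the exponential case.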

Suggested Citation

  • Hu, Audrey & Zou, Liang, 2026. "Optimal allocation strategies in a discrete-time bandit problem," Journal of Economic Dynamics and Control, Elsevier, vol. 184(C).
  • Handle: RePEc:eee:dyncon:v:184:y:2026:i:c:s0165188926000102
    DOI: 10.1016/j.jedc.2026.105264

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0165188926000102
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.jedc.2026.105264?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Heidhues, Paul & Rady, Sven & Strack, Philipp, 2015. "Strategic experimentation with private payoffs," Journal of Economic Theory, Elsevier, vol. 159(PA), pages 531-551.
    2. Rosenberg, Dinah & Salomon, Antoine & Vieille, Nicolas, 2013. "On games of strategic experimentation," Games and Economic Behavior, Elsevier, vol. 82(C), pages 31-51.
    3. Kaustav Das & Nicolas Klein & Katharina Schmid, 2020. "Strategic experimentation with asymmetric players," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 69(4), pages 1147-1175, June.
    4. Renault, Jérôme & Solan, Eilon & Vieille, Nicolas, 0. "Strategic experimentation with privately observed payoffs," Theoretical Economics, Econometric Society.
    5. Thomas, Caroline, 2019. "Experimentation with reputation concerns – Dynamic signalling with changing types," Journal of Economic Theory, Elsevier, vol. 179(C), pages 366-415.
    6. Chen, Chia-Hui & Ishida, Junichiro, 2018. "Hierarchical experimentation," Journal of Economic Theory, Elsevier, vol. 177(C), pages 365-404.
    7. Bloch, Francis & Fabrizi, Simona & Lippert, Steffen, 2022. "Hiding and herding in market entry," Journal of Economic Theory, Elsevier, vol. 206(C).
    8. Alessandro Lizzeri & Eran Shmaya & Leeat Yariv, 2024. "Disentangling Exploration from Exploitation," NBER Working Papers 32424, National Bureau of Economic Research, Inc.
    9. Klein, Nicolas, 2013. "Strategic learning in teams," Games and Economic Behavior, Elsevier, vol. 82(C), pages 636-657.
    10. Mira Frick & Yuhta Ishii, 2015. "Innovation Adoption by Forward-Looking Social Learners," Cowles Foundation Discussion Papers 1877, Cowles Foundation for Research in Economics, Yale University.
    11. Keller, Godfrey & Rady, Sven, 2015. "Breakdowns," Theoretical Economics, Econometric Society, vol. 10(1), January.
    12. Wagner, Peter A. & Klein, Nicolas, 2022. "Strategic investment and learning with private information," Journal of Economic Theory, Elsevier, vol. 204(C).
    13. Simina Brânzei & Yuval Peres, 2019. "Multiplayer Bandit Learning, from Competition to Cooperation," Papers 1908.01135, arXiv.org, revised Jan 2024.
    14. Chen, Chia-Hui & Ishida, Junichiro & Mukherjee, Arijit, 2023. "Pioneer, early follower or late entrant: Entry dynamics with learning and market competition," European Economic Review, Elsevier, vol. 152(C).
    15. Marlats, Chantal & Ménager, Lucie, 2021. "Strategic observation with exponential bandits," Journal of Economic Theory, Elsevier, vol. 193(C).
    16. Keller, Godfrey & Novák, Vladimír & Willems, Tim, 2019. "A note on optimal experimentation under risk aversion," Journal of Economic Theory, Elsevier, vol. 179(C), pages 476-487.
    17. Thomas Greve & Hans Keiding, 2023. "A model of privately funded public research," Journal of Economics, Springer, vol. 140(1), pages 63-91, September.
    18. Nicolas Klein & Tymofiy Mylovanov, 2011. "Should the Flatterers be Avoided?," 2011 Meeting Papers 1273, Society for Economic Dynamics.
    19. Besanko, David & Tong, Jian & Wu, Jianjun, 2016. "Subsidizing research programs with "if" and "when" uncertainty in the face of severe informational constraints," Discussion Paper Series In Economics And Econometrics 1605, Economics Division, School of Social Sciences, University of Southampton.
    20. Sorensen, Morten, 2007. "Learning by Investing: Evidence from Venture Capital," SIFR Research Report Series 53, Institute for Financial Research.
