Robust Experimentation in the Continuous Time Bandit Problem

Author

Farzad Pourbabaee

Abstract

We study the experimentation dynamics of a decision maker (DM) in a two-armed bandit setup (Bolton and Harris, 1999) in which the DM holds ambiguous beliefs about the distribution of the return process of one arm and is certain about the other. The DM has multiplier preferences à la Hansen and Sargent (2001), so we frame the decision-making environment as a two-player differential game against nature in continuous time. We characterize the DM's value function and her optimal experimentation strategy, which follows a cut-off rule with respect to her belief process. The belief threshold for exploring the ambiguous arm is found in closed form and is shown to be increasing in the ambiguity aversion index. We then study the effect of providing an unambiguous information source about the ambiguous arm. Interestingly, the exploration threshold rises unambiguously as a result of this new information source, leading to more conservatism. This analysis also sheds light on the efficient time to seek an expert opinion.
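
The cut-off rule described in the abstract can be illustrated with a minimal simulation sketch. It is not the paper's model: it assumes a Brownian return process with an unknown drift that is either high or low, plain Bayesian updating of the belief, and a threshold supplied by the caller rather than the closed-form threshold derived in the paper. Nature's distortion under multiplier preferences is not modeled. The function name and all parameters (`simulate_cutoff_rule`, `mu_high`, `mu_low`, `sigma`, `dt`, `n_steps`) are hypothetical.

```python
import math
import random


def simulate_cutoff_rule(p0, threshold, mu_high, mu_low, sigma, dt, n_steps,
                         true_state_high, seed=0):
    """Simulate a belief path under a cut-off experimentation rule.

    The DM pulls the uncertain arm while the belief p (probability that the
    drift is mu_high) is at or above `threshold`, and stops exploring for good
    once p falls below it. Returns (steps_explored, belief_path).

    Assumption: `threshold` is an input, not the paper's closed-form value.
    """
    rng = random.Random(seed)
    p = p0
    path = [p]
    steps = 0
    true_mu = mu_high if true_state_high else mu_low
    var = sigma ** 2 * dt  # variance of one return increment

    for _ in range(n_steps):
        if p < threshold:
            break  # belief too pessimistic: switch to the safe arm
        # Observed return increment of the uncertain arm over [t, t + dt].
        dx = true_mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Gaussian likelihoods of the increment under each drift hypothesis.
        like_high = math.exp(-((dx - mu_high * dt) ** 2) / (2.0 * var))
        like_low = math.exp(-((dx - mu_low * dt) ** 2) / (2.0 * var))
        # Bayes' rule for the posterior probability of the high drift.
        p = p * like_high / (p * like_high + (1.0 - p) * like_low)
        path.append(p)
        steps += 1
    return steps, path
```

For example, `simulate_cutoff_rule(0.5, 0.3, 1.0, 0.0, 1.0, 0.01, 1000, True, seed=42)` returns the number of periods spent exploring and the belief path. Raising `threshold` makes the simulated DM abandon the uncertain arm sooner, which is the sense in which a higher threshold corresponds to more conservatism.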

Suggested Citation

Farzad Pourbabaee, 2021. "Robust Experimentation in the Continuous Time Bandit Problem," Papers 2104.00102, arXiv.org.

Handle: RePEc:arx:papers:2104.00102

References listed on IDEAS

Frank Riedel, 2009. "Optimal Stopping With Multiple Priors," Econometrica, Econometric Society, vol. 77(3), pages 857-908, May.
Gustavo Manso, 2011. "Motivating Innovation," Journal of Finance, American Finance Association, vol. 66(5), pages 1823-1860, October.
Bonatti, Alessandro & Hörner, Johannes, 2017. "Learning to disagree in a game of experimentation," Journal of Economic Theory, Elsevier, vol. 169(C), pages 234-269.
- Alessandro Bonatti & Johannes Horner, 2015. "Learning to Disagree in a Game of Experimentation," Cowles Foundation Discussion Papers 1991, Cowles Foundation for Research in Economics, Yale University.
- Bonatti, Alessandro & Hörner, Johannes, 2017. "Learning to Disagree in a Game of Experimentation," TSE Working Papers 17-791, Toulouse School of Economics (TSE).
Godfrey Keller & Sven Rady, 1999. "Optimal Experimentation in a Changing Environment," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 66(3), pages 475-507.
- Godfrey Keller & Sven Rady, 1997. "Optimal Experimentation in a Changing Environment," STICERD - Theoretical Economics Paper Series 333, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
- Godfrey Keller & Sven Rady, 1998. "Optimal Experimentation in a Changing Environment," Game Theory and Information 9801001, University Library of Munich, Germany.
Weitzman, Martin L, 1979. "Optimal Search for the Best Alternative," Econometrica, Econometric Society, vol. 47(3), pages 641-654, May.
- M. L. Weitzman, 1978. "Optimal Search for the Best Alternative," Working papers 214, Massachusetts Institute of Technology (MIT), Department of Economics.
Fabio Maccheroni & Massimo Marinacci & Aldo Rustichini, 2006. "Ambiguity Aversion, Robustness, and the Variational Representation of Preferences," Econometrica, Econometric Society, vol. 74(6), pages 1447-1498, November.
- Fabio Maccheroni & Massimo Marinacci & Aldo Rustichini, 2004. "Ambiguity Aversion, Robustness, and the Variational Representation of Preferences," Carlo Alberto Notebooks 12, Collegio Carlo Alberto, revised 2006.
Lars Peter Hansen & Thomas J Sargent, 2014. "Robust Control and Model Uncertainty," World Scientific Book Chapters, in: Uncertainty within Economic Models, chapter 5, pages 145-154, World Scientific Publishing Co. Pte. Ltd.
- Thomas J. Sargent & Lars Peter Hansen, 2001. "Robust Control and Model Uncertainty," American Economic Review, American Economic Association, vol. 91(2), pages 60-66, May.
Godfrey Keller & Sven Rady & Martin Cripps, 2005. "Strategic Experimentation with Exponential Bandits," Econometrica, Econometric Society, vol. 73(1), pages 39-68, January.
- Rady, Sven & Cripps, Martin William & Keller, R Godfrey, 2003. "Strategic Experimentation with Exponential Bandits," CEPR Discussion Papers 3814, C.E.P.R. Discussion Papers.
- Cripps, Martin & Keller, Godfrey & Rady, Sven, 2003. "Strategic Experimentation with Exponential Bandits," Discussion Papers in Economics 4, University of Munich, Department of Economics.
- Godfrey Keller & Martin Cripps & Sven Rady, 2003. "Strategic Experimentation with Exponential Bandits," Economics Series Working Papers 143, University of Oxford, Department of Economics.
Yaoyao Wu & Jinqiang Yang & Zhentao Zou, 2018. "Ambiguity sharing and the lack of relative performance evaluation," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 66(1), pages 141-157, July.
Jianjun Miao & Alejandro Rivera, 2016. "Robust Contracts in Continuous Time," Econometrica, Econometric Society, vol. 84(4), pages 1405-1440, July.
- Jianjun Miao & Alejandro Rivera, 2013. "Robust Contracts in Continuous Time," Boston University - Department of Economics - Working Papers Series 2013-009, Boston University - Department of Economics.
Epstein, Larry G. & Schneider, Martin, 2003. "Recursive multiple-priors," Journal of Economic Theory, Elsevier, vol. 113(1), pages 1-31, November.
- Larry G. Epstein & Martin Schneider, 2001. "Recursive Multiple-Priors," RCER Working Papers 485, University of Rochester - Center for Economic Research (RCER).
Lars Peter Hansen & Thomas J Sargent, 2014. "Robust Control and Model Misspecification," World Scientific Book Chapters, in: Uncertainty within Economic Models, chapter 6, pages 155-216, World Scientific Publishing Co. Pte. Ltd.
- Hansen, Lars Peter & Sargent, Thomas J. & Turmuhambetova, Gauhar & Williams, Noah, 2006. "Robust control and model misspecification," Journal of Economic Theory, Elsevier, vol. 128(1), pages 45-90, May.
Larry G. Epstein & Martin Schneider, 2007. "Learning Under Ambiguity," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 74(4), pages 1275-1303.
- Larry Epstein & Martin Schneider, 2002. "Learning Under Ambiguity," RCER Working Papers 497, University of Rochester - Center for Economic Research (RCER), revised Mar 2005.
- Larry Epstein & Martin Schneider, 2006. "Learning Under Ambiguity," RCER Working Papers 527, University of Rochester - Center for Economic Research (RCER).
Patrick Bolton & Christopher Harris, 1999. "Strategic Experimentation," Econometrica, Econometric Society, vol. 67(2), pages 349-374, March.
Christopher Anderson, 2012. "Ambiguity aversion in multi-armed bandit problems," Theory and Decision, Springer, vol. 72(1), pages 15-33, January.
Robert J. Meyer & Yong Shi, 1995. "Sequential Choice Under Ambiguity: Intuitive Solutions to the Armed-Bandit Problem," Management Science, INFORMS, vol. 41(5), pages 817-834, May.
Hansen, Lars Peter & Sargent, Thomas J., 2011. "Robustness and ambiguity in continuous time," Journal of Economic Theory, Elsevier, vol. 146(3), pages 1195-1223, May.
Heidhues, Paul & Rady, Sven & Strack, Philipp, 2015. "Strategic experimentation with private payoffs," Journal of Economic Theory, Elsevier, vol. 159(PA), pages 531-551.
- Heidhues, Paul & Rady, Sven & Strack, Philipp, 2012. "Strategic Experimentation with Private Payoffs," Discussion Paper Series of SFB/TR 15 Governance and the Efficiency of Economic Systems 387, Free University of Berlin, Humboldt University of Berlin, University of Bonn, University of Mannheim, University of Munich.
- Rady, Sven & Heidhues, Paul & Strack, Philipp, 2015. "Strategic Experimentation with Private Payoffs," CEPR Discussion Papers 10634, C.E.P.R. Discussion Papers.
Maccheroni, Fabio & Marinacci, Massimo & Rustichini, Aldo, 2006. "Dynamic variational preferences," Journal of Economic Theory, Elsevier, vol. 128(1), pages 4-44, May.
- Fabio Maccheroni & Massimo Marinacci & Aldo Rustichini, 2006. "Dynamic Variational Preferences," Carlo Alberto Notebooks 1, Collegio Carlo Alberto.
Larry G. Epstein & Shaolin Ji, 2022. "Optimal Learning Under Robustness and Time-Consistency," Operations Research, INFORMS, vol. 70(3), pages 1317-1329, May.
- Larry G. Epstein & Shaolin Ji, 2017. "Optimal Learning under Robustness and Time-Consistency," Papers 1708.01890, arXiv.org, revised Mar 2019.
Gilboa, Itzhak & Schmeidler, David, 1989. "Maxmin expected utility with non-unique prior," Journal of Mathematical Economics, Elsevier, vol. 18(2), pages 141-153, April.
- Gilboa, Itzhak & Schmeidler, David, 1986. "Maxmin Expected Utility with a Non-Unique Prior," Foerder Institute for Economic Research Working Papers 275405, Tel-Aviv University > Foerder Institute for Economic Research.
- Itzhak Gilboa & David Schmeidler, 1989. "Maxmin Expected Utility with Non-Unique Prior," Post-Print hal-00753237, HAL.
Massimo Marinacci, 2002. "Learning from ambiguous urns," Statistical Papers, Springer, vol. 43(1), pages 143-151, January.
Li, Jian, 2019. "The K-armed bandit problem with multiple priors," Journal of Mathematical Economics, Elsevier, vol. 80(C), pages 22-38.
Yulei Luo, 2017. "Robustly Strategic Consumption–Portfolio Rules with Informational Frictions," Management Science, INFORMS, vol. 63(12), pages 4158-4174, December.
- Luo, Yulei, 2015. "Robustly Strategic Consumption-Portfolio Rules with Informational Frictions," MPRA Paper 64312, University Library of Munich, Germany.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Farzad Pourbabaee, 2022. "Robust experimentation in the continuous time bandit problem," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 73(1), pages 151-181, February.
Li, Jian, 2019. "The K-armed bandit problem with multiple priors," Journal of Mathematical Economics, Elsevier, vol. 80(C), pages 22-38.
Hansen, Peter G., 2022. "New formulations of ambiguous volatility with an application to optimal dynamic contracting," Journal of Economic Theory, Elsevier, vol. 199(C).
Li, Jing, 2018. "Essays on model uncertainty in financial models," Other publications TiSEM 202cd910-7ef1-4db4-94ae-d, Tilburg University, School of Economics and Management.
Hansen, Lars Peter & Sargent, Thomas J., 2022. "Structured ambiguity and model misspecification," Journal of Economic Theory, Elsevier, vol. 199(C).
Hansen, Lars Peter & Szőke, Bálint & Han, Lloyd S. & Sargent, Thomas J., 2020. "Twisted probabilities, uncertainty, and prices," Journal of Econometrics, Elsevier, vol. 216(1), pages 151-174.
Chambers, Robert G. & Melkonyan, Tigran, 2009. "Smoothing preference kinks with information," Mathematical Social Sciences, Elsevier, vol. 58(2), pages 173-189, September.
Peter G. Hansen, 2021. "New Formulations of Ambiguous Volatility with an Application to Optimal Dynamic Contracting," Papers 2101.12306, arXiv.org.
Swagata Bhattacharjee, 2019. "Dynamic Contracting for Innovation Under Ambiguity," Working Papers 1022, Ashoka University, Department of Economics, revised Aug 2019.
Battigalli, P. & Francetich, A. & Lanzani, G. & Marinacci, M., 2019. "Learning and self-confirming long-run biases," Journal of Economic Theory, Elsevier, vol. 183(C), pages 740-785.
Daniele Pennesi, 2013. "Asset Prices in an Ambiguous Economy," Carlo Alberto Notebooks 315, Collegio Carlo Alberto.
Paul Viefers, 2012. "Should I Stay or Should I Go?: A Laboratory Analysis of Investment Opportunities under Ambiguity," Discussion Papers of DIW Berlin 1228, DIW Berlin, German Institute for Economic Research.
Bhattacharjee, Swagata, 2022. "Dynamic contracting for innovation under ambiguity," Games and Economic Behavior, Elsevier, vol. 132(C), pages 534-552.
- Swagata Bhattacharjee, 2019. "Dynamic Contracting for Innovation Under Ambiguity," Working Papers 15, Ashoka University, Department of Economics, revised 02 Aug 2019.
Li, Jian & Zhou, Junjie, 2016. "Blackwell's informativeness ranking with uncertainty-averse preferences," Games and Economic Behavior, Elsevier, vol. 96(C), pages 18-29.
Alexander Zimper, 2011. "Do Bayesians Learn Their Way Out of Ambiguity?," Decision Analysis, INFORMS, vol. 8(4), pages 269-285, December.
- Alexander Zimper, 2011. "Do Bayesians learn their way out of ambiguity?," Working Papers 240, Economic Research Southern Africa.
Michael Barnett & Greg Buchak & Constantine Yannelis, 2023. "Epidemic responses under uncertainty," Proceedings of the National Academy of Sciences, Proceedings of the National Academy of Sciences, vol. 120(2), pages 2208111120-, January.
- Michael Barnett & Greg Buchak & Constantine Yannelis, 2020. "Epidemic Responses Under Uncertainty," NBER Working Papers 27289, National Bureau of Economic Research, Inc.
- Michael Barnett & Greg Buchak & Constantine Yannelis, 2020. "Epidemic Responses Under Uncertainty," Working Papers 2020-72, Becker Friedman Institute for Research In Economics.
Agarwal, Vikas & Arisoy, Y. Eser & Naik, Narayan Y., 2017. "Volatility of aggregate volatility and hedge fund returns," Journal of Financial Economics, Elsevier, vol. 125(3), pages 491-510.
- Vikas Agarwal & Eser Arisoy & Narayan Y. Naik, 2015. "Volatility of Aggregate Volatility and Hedge Fund Returns," Post-Print hal-01412976, HAL.
- Vikas Agarwal & Eser Arisoy & Narayan Y. Naik, 2017. "Volatility of Aggregate Volatility and Hedge Fund Returns," Post-Print hal-01634155, HAL.
- Agarwal, Vikas & Arisoy, Y. Eser & Naik, Narayan Y., 2015. "Volatility of aggregate volatility and hedge funds returns," CFR Working Papers 15-03, University of Cologne, Centre for Financial Research (CFR).
- Agarwal, Vikas & Arisoy, Y. Eser & Naik, Narayan Y., 2015. "Volatility of aggregate volatility and hedge funds returns," CFR Working Papers 15-03 [rev.], University of Cologne, Centre for Financial Research (CFR).
Massimo Guidolin & Francesca Rinaldi, 2013. "Ambiguity in asset pricing and portfolio choice: a review of the literature," Theory and Decision, Springer, vol. 74(2), pages 183-217, February.
- Massimo Guidolin & Francesca Rinaldi, 2010. "Ambiguity in asset pricing and portfolio choice: a review of the literature," Working Papers 2010-028, Federal Reserve Bank of St. Louis.
- Massimo Guidolin & Francesca Rinaldi, 2011. "Ambiguity in Asset Pricing and Portfolio Choice: A Review of the Literature," Working Papers 417, IGIER (Innocenzo Gasparini Institute for Economic Research), Bocconi University.
Nengjiu Ju & Jianjun Miao, 2012. "Ambiguity, Learning, and Asset Returns," Econometrica, Econometric Society, vol. 80(2), pages 559-591, March.
- Nengjiu Ju & Jianjun Miao, "undated". "Ambiguity, Learning, and Asset Returns," Boston University - Department of Economics - Working Papers Series wp2009-014, Boston University - Department of Economics.
- Jianjun Miao & Nengjiu Ju, 2010. "Ambiguity, Learning, and Asset Returns," Boston University - Department of Economics - Working Papers Series WP2010-031, Boston University - Department of Economics.
- Ju, Nengjiu & Miao, Jianjun, 2009. "Ambiguity, Learning, and Asset Returns," MPRA Paper 14737, University Library of Munich, Germany, revised Apr 2009.
- Nengjiu Ju & Jianjun Miao, 2010. "Ambiguity, Learning, and Asset Returns," CEMA Working Papers 438, China Economics and Management Academy, Central University of Finance and Economics.
Bommier, Antoine & Kochov, Asen & Le Grand, François, 2019. "Ambiguity and endogenous discounting," Journal of Mathematical Economics, Elsevier, vol. 83(C), pages 48-62.
- Antoine Bommier & Asen Kochov & François Le Grand, 2019. "Ambiguity and endogenous discounting," Post-Print hal-02312365, HAL.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-EXP-2021-04-05 (Experimental Economics)
NEP-GTH-2021-04-05 (Game Theory)
NEP-MIC-2021-04-05 (Microeconomics)
