Robust Multiarmed Bandit Problems
Author
Abstract
Suggested Citation
DOI: 10.1287/mnsc.2015.2153
Download full text from publisher
References listed on IDEAS
- Hansen, Lars Peter & Sargent, Thomas J., 2007.
"Recursive robust estimation and control without commitment,"
Journal of Economic Theory, Elsevier, vol. 136(1), pages 1-27, September.
- Hansen, Lars Peter & Sargent, Thomas J., 2005. "Recursive robust estimation and control without commitment," Discussion Paper Series 1: Economic Studies 2005,28, Deutsche Bundesbank.
- Carri W. Chan & Vivek F. Farias, 2009. "Stochastic Depletion Problems: Effective Myopic Policies for a Class of Dynamic Optimization Problems," Mathematics of Operations Research, INFORMS, vol. 34(2), pages 333-350, May.
- Larry G. Epstein & Martin Schneider, 2007.
"Learning Under Ambiguity,"
The Review of Economic Studies, Review of Economic Studies Ltd, vol. 74(4), pages 1275-1303.
- Larry Epstein & Martin Schneider, 2002. "Learning Under Ambiguity," RCER Working Papers 497, University of Rochester - Center for Economic Research (RCER), revised Mar 2005.
- Larry Epstein & Martin Schneider, 2006. "Learning Under Ambiguity," RCER Working Papers 527, University of Rochester - Center for Economic Research (RCER).
- Andrea Grove & Gary A. Berg (ed.), 2014. "Social Business," Springer Books, Springer, edition 127, number 978-3-642-45275-8, January.
- Wolfram Wiesemann & Daniel Kuhn & Berç Rustem, 2013. "Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 38(1), pages 153-183, February.
- ,, 2000. "Problems And Solutions," Econometric Theory, Cambridge University Press, vol. 16(2), pages 287-299, April.
- David B. Brown & James E. Smith & Peng Sun, 2010. "Information Relaxations and Duality in Stochastic Dynamic Programs," Operations Research, INFORMS, vol. 58(4-part-1), pages 785-801, August.
- Garud N. Iyengar, 2005. "Robust Dynamic Programming," Mathematics of Operations Research, INFORMS, vol. 30(2), pages 257-280, May.
- A. Ben-Tal & A. Nemirovski, 1998. "Robust Convex Optimization," Mathematics of Operations Research, INFORMS, vol. 23(4), pages 769-805, November.
- Dimitris Bertsimas & Melvyn Sim, 2004. "The Price of Robustness," Operations Research, INFORMS, vol. 52(1), pages 35-53, February.
- Andrew E. B. Lim & J. George Shanthikumar, 2007. "Relative Entropy, Exponential Utility, and Robust Dynamic Pricing," Operations Research, INFORMS, vol. 55(2), pages 198-214, April.
- David B. Brown & James E. Smith, 2013. "Optimal Sequential Exploration: Bandits, Clairvoyants, and Wildcats," Operations Research, INFORMS, vol. 61(3), pages 644-665, June.
- Epstein, Larry G. & Schneider, Martin, 2003.
"Recursive multiple-priors,"
Journal of Economic Theory, Elsevier, vol. 113(1), pages 1-31, November.
- Larry G. Epstein & Martin Schneider, 2001. "Recursive Multiple-Priors," RCER Working Papers 485, University of Rochester - Center for Economic Research (RCER).
- Arnab Nilim & Laurent El Ghaoui, 2005. "Robust Control of Markov Decision Processes with Uncertain Transition Matrices," Operations Research, INFORMS, vol. 53(5), pages 780-798, October.
- Felipe Caro & Jérémie Gallien, 2007. "Dynamic Assortment with Demand Learning for Seasonal Consumer Goods," Management Science, INFORMS, vol. 53(2), pages 276-292, February.
- Hansen, Lars Peter & Sargent, Thomas J., 2005. "Robust estimation and control under commitment," Journal of Economic Theory, Elsevier, vol. 124(2), pages 258-301, October.
- Jonathan Li & Roy Kwon, 2013. "Portfolio selection under model uncertainty: a penalized moment-based optimization approach," Journal of Global Optimization, Springer, vol. 56(1), pages 131-164, May.
- Andrew E. B. Lim & J. George Shanthikumar & Gah-Yi Vahn, 2012. "Robust Portfolio Choice with Learning in the Framework of Regret: Single-Period Case," Management Science, INFORMS, vol. 58(9), pages 1732-1746, September.
- Paat Rusmevichientong & Zuo-Jun Max Shen & David B. Shmoys, 2010. "Dynamic Assortment Optimization with a Multinomial Logit Choice Model and Capacity Constraint," Operations Research, INFORMS, vol. 58(6), pages 1666-1680, December.
- Dimitris Bertsimas & José Niño-Mora, 1996. "Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; A Polyhedral Approach to Indexable Systems," Mathematics of Operations Research, INFORMS, vol. 21(2), pages 257-306, May.
- Ilya O. Ryzhov & Warren B. Powell & Peter I. Frazier, 2012. "The Knowledge Gradient Algorithm for a General Class of Online Learning Problems," Operations Research, INFORMS, vol. 60(1), pages 180-195, February.
Citations
Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
Cited by:
- Nikolsko-Rzhevskyy, Alex & Papell, David H. & Prodan, Ruxandra, 2017. "The Yellen rules," Journal of Macroeconomics, Elsevier, vol. 54(PA), pages 59-71.
- Xikui Wang & You Liang & Lysa Porth, 2019. "A Bayesian two‐armed bandit model," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 35(3), pages 624-636, May.
- Michael Jong Kim, 2020. "Variance Regularization in Sequential Bayesian Optimization," Mathematics of Operations Research, INFORMS, vol. 45(3), pages 966-992, August.
- Xu, Jianyu & Chen, Lujie & Tang, Ou, 2021. "An online algorithm for the risk-aware restless bandit," European Journal of Operational Research, Elsevier, vol. 290(2), pages 622-639.
- Keskin, Burcu B. & Griffin, Emily C. & Prell, Jonathan O. & Dilkina, Bistra & Ferber, Aaron & MacDonald, John & Hilend, Rowan & Griffis, Stanley & Gore, Meredith L., 2023. "Quantitative Investigation of Wildlife Trafficking Supply Chains: A Review," Omega, Elsevier, vol. 115(C).
- David B. Brown & Martin B. Haugh, 2017. "Information Relaxation Bounds for Infinite Horizon Markov Decision Processes," Operations Research, INFORMS, vol. 65(5), pages 1355-1379, October.
- Eisenberg, Julia & Krühner, Paul, 2018. "The impact of negative interest rates on optimal capital injections," Insurance: Mathematics and Economics, Elsevier, vol. 82(C), pages 1-10.
- Malekipirbazari, Milad & Çavuş, Özlem, 2024. "Index policy for multiarmed bandit problem with dynamic risk measures," European Journal of Operational Research, Elsevier, vol. 312(2), pages 627-640.
- Flores-Szwagrzak, Karol, 2022. "Learning by Convex Combination," Working Papers 16-2022, Copenhagen Business School, Department of Economics.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.- Michael Jong Kim, 2016. "Robust Control of Partially Observable Failing Systems," Operations Research, INFORMS, vol. 64(4), pages 999-1014, August.
- Shie Mannor & Ofir Mebel & Huan Xu, 2016. "Robust MDPs with k -Rectangular Uncertainty," Mathematics of Operations Research, INFORMS, vol. 41(4), pages 1484-1509, November.
- Bren, Austin & Saghafian, Soroush, 2018. "Data-Driven Percentile Optimization for Multi-Class Queueing Systems with Model Ambiguity: Theory and Application," Working Paper Series rwp18-008, Harvard University, John F. Kennedy School of Government.
- Rasouli, Mohammad & Saghafian, Soroush, 2018. "Robust Partially Observable Markov Decision Processes," Working Paper Series rwp18-027, Harvard University, John F. Kennedy School of Government.
- Saghafian, Soroush, 2018. "Ambiguous partially observable Markov decision processes: Structural results and applications," Journal of Economic Theory, Elsevier, vol. 178(C), pages 1-35.
- Andrew J. Keith & Darryl K. Ahner, 2021. "A survey of decision making and optimization under uncertainty," Annals of Operations Research, Springer, vol. 300(2), pages 319-353, May.
- Bakker, Hannah & Dunke, Fabian & Nickel, Stefan, 2020. "A structuring review on multi-stage optimization under uncertainty: Aligning concepts from theory and practice," Omega, Elsevier, vol. 96(C).
- Maximilian Blesch & Philipp Eisenhauer, 2021. "Robust decision-making under risk and ambiguity," Papers 2104.12573, arXiv.org, revised Oct 2021.
- Hansen, Lars Peter & Mayer, Ricardo & Sargent, Thomas, 2010. "Robust hidden Markov LQG problems," Journal of Economic Dynamics and Control, Elsevier, vol. 34(10), pages 1951-1966, October.
- Jose Blanchet & Karthyek Murthy, 2019. "Quantifying Distributional Model Risk via Optimal Transport," Mathematics of Operations Research, INFORMS, vol. 44(2), pages 565-600, May.
- Nicole Bauerle & Alexander Glauner, 2020. "Distributionally Robust Markov Decision Processes and their Connection to Risk Measures," Papers 2007.13103, arXiv.org.
- Xin, Linwei & Goldberg, David A., 2021. "Time (in)consistency of multistage distributionally robust inventory models with moment constraints," European Journal of Operational Research, Elsevier, vol. 289(3), pages 1127-1141.
- Wei, Cansheng & Li, Yongjian & Cai, Xiaoqiang, 2011. "Robust optimal policies of production and inventory with uncertain returns and demand," International Journal of Production Economics, Elsevier, vol. 134(2), pages 357-367, December.
- Alexander Shapiro, 2016. "Rectangular Sets of Probability Measures," Operations Research, INFORMS, vol. 64(2), pages 528-541, April.
- Andrew E. B. Lim & J. George Shanthikumar, 2007. "Relative Entropy, Exponential Utility, and Robust Dynamic Pricing," Operations Research, INFORMS, vol. 55(2), pages 198-214, April.
- Alper Atamtürk & Muhong Zhang, 2007. "Two-Stage Robust Network Flow and Design Under Demand Uncertainty," Operations Research, INFORMS, vol. 55(4), pages 662-673, August.
- Fernandes, Betina & Street, Alexandre & Valladão, Davi & Fernandes, Cristiano, 2016. "An adaptive robust portfolio optimization model with loss constraints based on data-driven polyhedral uncertainty sets," European Journal of Operational Research, Elsevier, vol. 255(3), pages 961-970.
- Erim Kardeş & Fernando Ordóñez & Randolph W. Hall, 2011. "Discounted Robust Stochastic Games and an Application to Queueing Control," Operations Research, INFORMS, vol. 59(2), pages 365-382, April.
- Wenqing Chen & Melvyn Sim & Jie Sun & Chung-Piaw Teo, 2010. "From CVaR to Uncertainty Set: Implications in Joint Chance-Constrained Optimization," Operations Research, INFORMS, vol. 58(2), pages 470-485, April.
- Stefan Mišković, 2017. "A VNS-LP algorithm for the robust dynamic maximal covering location problem," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 39(4), pages 1011-1033, October.
More about this item
Keywords
bandit problems; robust control; model uncertainty; relative entropy; games against nature;All these keywords.
Statistics
Access and download statisticsCorrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:ormnsc:v:62:y:2016:i:1:p:264-285. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Chris Asher (email available below). General contact details of provider: https://edirc.repec.org/data/inforea.html .
Please note that corrections may take a couple of weeks to filter through the various RePEc services.