Risk and optimal policies in bandit experiments

Author

Listed:

Karun Adusumilli

Abstract

We provide a decision theoretic analysis of bandit experiments under local asymptotics. Working within the framework of diffusion processes, we define suitable notions of asymptotic Bayes and minimax risk for these experiments. For normally distributed rewards, the minimal Bayes risk can be characterized as the solution to a second-order partial differential equation (PDE). Using a limit of experiments approach, we show that this PDE characterization also holds asymptotically under both parametric and non-parametric distributions of the rewards. The approach further describes the state variables it is asymptotically sufficient to restrict attention to, and thereby suggests a practical strategy for dimension reduction. The PDEs characterizing minimal Bayes risk can be solved efficiently using sparse matrix routines or Monte-Carlo methods. We derive the optimal Bayes and minimax policies from their numerical solutions. These optimal policies substantially dominate existing methods such as Thompson sampling; the risk of the latter is often twice as high.
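The abstract names Thompson sampling as the baseline and Monte-Carlo methods as one computational route. As a rough illustration only, the sketch below estimates the Bayes risk (prior-averaged cumulative regret) of Gaussian Thompson sampling in a two-armed bandit by simulation. It is not the paper's PDE-based optimal policy, and every specific here is an assumption made for the example: the function names `thompson_regret` and `bayes_risk`, the N(0, prior_var) prior on arm means, the known unit reward variance, and the unscaled regret definition (the paper works under local asymptotic scaling, which this sketch does not reproduce).

```python
import math
import random


def thompson_regret(mu, horizon, prior_var=1.0, noise_var=1.0, rng=None):
    """Cumulative regret of Gaussian Thompson sampling on one simulated path.

    Assumes (for illustration) independent N(0, prior_var) priors on the arm
    means and rewards drawn from N(mu[a], noise_var) with known noise_var.
    """
    rng = rng or random.Random()
    k = len(mu)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # running sum of observed rewards per arm
    best = max(mu)
    regret = 0.0
    for _ in range(horizon):
        draws = []
        for a in range(k):
            # Conjugate normal posterior for arm a.
            precision = 1.0 / prior_var + counts[a] / noise_var
            post_mean = (sums[a] / noise_var) / precision
            post_sd = math.sqrt(1.0 / precision)
            draws.append(rng.gauss(post_mean, post_sd))
        # Pull the arm whose posterior draw is largest.
        arm = max(range(k), key=draws.__getitem__)
        reward = rng.gauss(mu[arm], math.sqrt(noise_var))
        counts[arm] += 1
        sums[arm] += reward
        regret += best - mu[arm]
    return regret


def bayes_risk(horizon, n_sims, prior_var=1.0, seed=0):
    """Monte Carlo estimate of Bayes risk: regret averaged over prior draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        # Draw the true arm means from the same prior the policy uses.
        mu = [rng.gauss(0.0, math.sqrt(prior_var)) for _ in range(2)]
        total += thompson_regret(mu, horizon, prior_var, rng=rng)
    return total / n_sims
```

Calling `bayes_risk(horizon=200, n_sims=500)` gives a simulated risk figure for this baseline; comparing it against an optimal policy would require the numerical PDE solution described in the paper, which is not attempted here.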

Suggested Citation

Karun Adusumilli, 2021. "Risk and optimal policies in bandit experiments," Papers 2112.06363, arXiv.org, revised Jan 2024.

Handle: RePEc:arx:papers:2112.06363

References listed on IDEAS

Keisuke Hirano & Jack R. Porter, 2009. "Asymptotics for Statistical Treatment Rules," Econometrica, Econometric Society, vol. 77(5), pages 1683-1701, September.
- Hirano, Keisuke & Porter, Jack, 2006. "Asymptotics for statistical treatment rules," MPRA Paper 1173, University Library of Munich, Germany.
Yves Achdou & Jiequn Han & Jean-Michel Lasry & Pierre-Louis Lions & Benjamin Moll, 2017. "Income and Wealth Distribution in Macroeconomics: A Continuous-Time Approach," NBER Working Papers 23732, National Bureau of Economic Research, Inc.
Maximilian Kasy & Anja Sautmann, 2021. "Adaptive Treatment Assignment in Experiments for Policy Choice," Econometrica, Econometric Society, vol. 89(1), pages 113-132, January.
- Maximilian Kasy & Anja Sautmann, 2019. "Adaptive Treatment Assignment in Experiments for Policy Choice," CESifo Working Paper Series 7778, CESifo.
Rothschild, Michael, 1974. "A two-armed bandit theory of market pricing," Journal of Economic Theory, Elsevier, vol. 9(2), pages 185-202, October.

Citations

Cited by:

Keisuke Hirano & Jack R. Porter, 2023. "Asymptotic Representations for Sequential Decisions, Adaptive Experiments, and Batched Bandits," Papers 2302.03117, arXiv.org.
Masahiro Kato & Masaaki Imaizumi & Takuya Ishihara & Toru Kitagawa, 2023. "Asymptotically Optimal Fixed-Budget Best Arm Identification with Variance-Dependent Bounds," Papers 2302.02988, arXiv.org, revised Jul 2023.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Masahiro Kato & Masaaki Imaizumi & Takuya Ishihara & Toru Kitagawa, 2023. "Asymptotically Optimal Fixed-Budget Best Arm Identification with Variance-Dependent Bounds," Papers 2302.02988, arXiv.org, revised Jul 2023.
Arthur Charpentier & Romuald Élie & Carl Remlinger, 2023. "Reinforcement Learning in Economics and Finance," Computational Economics, Springer;Society for Computational Economics, vol. 62(1), pages 425-462, June.
Kock, Anders Bredahl & Preinerstorfer, David & Veliyev, Bezirgen, 2023. "Treatment recommendation with distributional targets," Journal of Econometrics, Elsevier, vol. 234(2), pages 624-646.
- Anders Bredahl Kock & David Preinerstorfer & Bezirgen Veliyev, 2020. "Treatment recommendation with distributional targets," Papers 2005.09717, arXiv.org, revised Apr 2022.
Karun Adusumilli & Friedrich Geiecke & Claudio Schilter, 2019. "Dynamically Optimal Treatment Allocation using Reinforcement Learning," Papers 1904.01047, arXiv.org, revised May 2022.
Toru Kitagawa & Guanyi Wang, 2021. "Who should get vaccinated? Individualized allocation of vaccines over SIR network," CeMMAP working papers CWP28/21, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Arthur Charpentier & Romuald Elie & Carl Remlinger, 2020. "Reinforcement Learning in Economics and Finance," Papers 2003.10014, arXiv.org.
Karun Adusumilli, 2022. "Neyman allocation is minimax optimal for best arm identification with two arms," Papers 2204.05527, arXiv.org, revised Aug 2022.
Kitagawa, Toru & Wang, Guanyi, 2023. "Who should get vaccinated? Individualized allocation of vaccines over SIR network," Journal of Econometrics, Elsevier, vol. 232(1), pages 109-131.
Masahiro Kato & Masaaki Imaizumi & Takuya Ishihara & Toru Kitagawa, 2022. "Best Arm Identification with Contextual Information under a Small Gap," Papers 2209.07330, arXiv.org, revised Jan 2023.
Kaido, Hiroaki, 2017. "Asymptotically Efficient Estimation Of Weighted Average Derivatives With An Interval Censored Variable," Econometric Theory, Cambridge University Press, vol. 33(5), pages 1218-1241, October.
- Hiroaki Kaido, 2013. "Asymptotically Efficient Estimation of Weighted Average Derivatives with an Interval Censored Variable," Boston University - Department of Economics - Working Papers Series 2013-022, Boston University - Department of Economics.
- Hiroaki Kaido, 2014. "Asymptotically efficient estimation of weighted average derivatives with an interval censored variable," CeMMAP working papers 03/14, Institute for Fiscal Studies.
- Hiroaki Kaido, 2014. "Asymptotically efficient estimation of weighted average derivatives with an interval censored variable," CeMMAP working papers CWP03/14, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Sylvain Chassang, 2010. "Building Routines: Learning, Cooperation, and the Dynamics of Incomplete Relational Contracts," American Economic Review, American Economic Association, vol. 100(1), pages 448-465, March.
Karun Adusumilli, 2022. "How to sample and when to stop sampling: The generalized Wald problem and minimax policies," Papers 2210.15841, arXiv.org, revised Feb 2024.
Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Xiaohong Chen & Andres Santos, 2018. "Overidentification in Regular Models," Econometrica, Econometric Society, vol. 86(5), pages 1771-1817, September.
- Xiaohong Chen & Andres Santos, 2015. "Overidentification in Regular Models," Cowles Foundation Discussion Papers 1999, Cowles Foundation for Research in Economics, Yale University.
- Xiaohong Chen & Andres Santos, 2015. "Overidentification in Regular Models," Cowles Foundation Discussion Papers 1999R, Cowles Foundation for Research in Economics, Yale University, revised Jun 2018.
Martin Peitz & Sven Rady & Piers Trepper, 2017. "Experimentation in Two-Sided Markets," Journal of the European Economic Association, European Economic Association, vol. 15(1), pages 128-172.
- Peitz, Martin & Rady, Sven & Trepper, Piers, 2011. "Experimentation in Two-Sided Markets," Discussion Paper Series of SFB/TR 15 Governance and the Efficiency of Economic Systems 365, Free University of Berlin, Humboldt University of Berlin, University of Bonn, University of Mannheim, University of Munich.
- Peitz, Martin & Rady, Sven & Trepper, Piers, 2017. "Experimentation in Two-Sided Markets," Munich Reprints in Economics 55039, University of Munich, Department of Economics.
- Peitz, Martin & Rady, Sven & Trepper, Piers, 2013. "Experimentation in Two-Sided Markets," Working Papers 13-03, University of Mannheim, Department of Economics.
- Rady, Sven & Peitz, Martin & Trepper, Piers, 2011. "Experimentation in Two-Sided Markets," CEPR Discussion Papers 8670, C.E.P.R. Discussion Papers.
- Martin Peitz & Sven Rady & Piers Trepper, 2015. "Experimentation in Two-Sided Markets," CESifo Working Paper Series 5346, CESifo.
Kitagawa, Toru & Muris, Chris, 2016. "Model averaging in semiparametric estimation of treatment effects," Journal of Econometrics, Elsevier, vol. 193(1), pages 271-289.
- Toru Kitagawa & Chris Muris, 2015. "Model averaging in semiparametric estimation of treatment effects," CeMMAP working papers 46/15, Institute for Fiscal Studies.
- Toru Kitagawa & Chris Muris, 2015. "Model averaging in semiparametric estimation of treatment effects," CeMMAP working papers CWP46/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Deniz Sevinc & Edgar Mata Flores & Simon Collinson, 2020. "Are there inequality spillovers? Evidence through a modified inequality measure and European dynamics of inequality," Working Papers 545, ECINEQ, Society for the Study of Economic Inequality.
Keller, Godfrey & Novák, Vladimír & Willems, Tim, 2019. "A note on optimal experimentation under risk aversion," Journal of Economic Theory, Elsevier, vol. 179(C), pages 476-487.
- Vladimir Novak & Tim Willems, 2018. "A Note on Optimal Experimentation under Risk Aversion," CERGE-EI Working Papers wp618, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
Tetenov, Aleksey, 2012. "Statistical treatment choice based on asymmetric minimax regret criteria," Journal of Econometrics, Elsevier, vol. 166(1), pages 157-165.
- Aleksey Tetenov, 2009. "Statistical Treatment Choice Based on Asymmetric Minimax Regret Criteria," Carlo Alberto Notebooks 119, Collegio Carlo Alberto.
Giovanni L. Violante & Greg Kaplan, 2022. "The Marginal Propensity to Consume in Heterogeneous Agent Models," Annual Review of Economics, Annual Reviews, vol. 14(1), pages 747-775, August.
- Greg Kaplan & Giovanni L. Violante, 2021. "The Marginal Propensity to Consume in Heterogeneous Agent Models," Working Papers 2021-9, Princeton University, Economics Department.
- Violante, Giovanni & Kaplan, Greg, 2022. "The Marginal Propensity to Consume in Heterogeneous Agent Models," CEPR Discussion Papers 17271, C.E.P.R. Discussion Papers.
- Greg Kaplan & Giovanni L. Violante, 2022. "The Marginal Propensity to Consume in Heterogeneous Agent Models," NBER Working Papers 30013, National Bureau of Economic Research, Inc.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-RMG-2022-01-24 (Risk Management)
