Earning While Learning: How to Run Batched Bandit Experiments

Author

Listed:

Kemper, Jan
Rostam-Afschar, Davud

Registered:

Davud Rostam-Afschar

Abstract

Researchers typically collect experimental data sequentially, allowing early outcome observations and adaptive treatment assignment to reduce exposure to inferior treatments. This article reviews multi-armed-bandit adaptive experimental designs that balance exploration and exploitation. Because experimental data collected adaptively through bandit algorithms violate standard asymptotics, inference is challenging. We implement an estimator that yields valid heteroskedasticity-robust confidence intervals in batched bandit designs and compare coverage in Monte Carlo simulations. We introduce bbandits for Stata, a tool for designing experiments via simulation, running interactive bandit experiments, and implementing and analyzing adaptively collected data. bbandits includes three common assignment algorithms (ε-first, ε-greedy, and Thompson sampling) and supports estimation, inference, and visualization.
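To make the three assignment rules named in the abstract concrete, the following is a minimal Python sketch of a batched bandit loop with binary outcomes. It is not the authors' code and does not reproduce the bbandits syntax or their estimator; the function names (`assignment_shares`, `run_experiment`), the parameters (`epsilon`, `draws`), the uniform Beta(1, 1) prior, and the simplified ε-first rule are all assumptions made for illustration.

```python
import random


def assignment_shares(algorithm, successes, failures, batch, n_batches,
                      epsilon=0.2, draws=2000, rng=random):
    """Return the share of the next batch assigned to each arm (hypothetical helper)."""
    k = len(successes)
    uniform = [1.0 / k] * k
    trials = [s + f for s, f in zip(successes, failures)]
    if sum(trials) == 0:
        # No outcomes observed yet: split the first batch evenly.
        return uniform
    # Empirical success rate per arm; arms without data default to 0.
    means = [s / t if t else 0.0 for s, t in zip(successes, trials)]
    best = max(range(k), key=lambda a: means[a])

    if algorithm == "eps_first":
        # Assumed simplification: explore uniformly during the first epsilon
        # share of batches, then put each whole batch on the current best arm.
        if batch < epsilon * n_batches:
            return uniform
        return [1.0 if a == best else 0.0 for a in range(k)]
    if algorithm == "eps_greedy":
        # Keep a share epsilon of every batch for uniform exploration.
        return [epsilon / k + (1.0 - epsilon) * (a == best) for a in range(k)]
    if algorithm == "thompson":
        # Share = Monte Carlo estimate of P(arm is best) under independent
        # Beta(1 + successes, 1 + failures) posteriors (assumed uniform prior).
        wins = [0] * k
        for _ in range(draws):
            sample = [rng.betavariate(1 + s, 1 + f)
                      for s, f in zip(successes, failures)]
            wins[max(range(k), key=lambda a: sample[a])] += 1
        return [w / draws for w in wins]
    raise ValueError(f"unknown algorithm: {algorithm}")


def run_experiment(algorithm, true_rates, n_batches=10, batch_size=200, seed=1):
    """Simulate a batched experiment; return per-arm successes and failures."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes, failures = [0] * k, [0] * k
    for batch in range(n_batches):
        shares = assignment_shares(algorithm, successes, failures,
                                   batch, n_batches, rng=rng)
        # Assignment probabilities are fixed within a batch; outcomes are
        # only used to update the shares for the next batch.
        arms = rng.choices(range(k), weights=shares, k=batch_size)
        for arm in arms:
            if rng.random() < true_rates[arm]:
                successes[arm] += 1
            else:
                failures[arm] += 1
    return successes, failures
```

For example, `run_experiment("thompson", [0.10, 0.12, 0.20])` simulates ten batches of 200 units over three arms with made-up success rates. Arm means computed naively from the returned counts are subject to the inference problem the abstract describes, which this sketch does not address.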

Suggested Citation

Kemper, Jan & Rostam-Afschar, Davud, 2026. "Earning While Learning: How to Run Batched Bandit Experiments," GLO Discussion Paper Series 1717, Global Labor Organization (GLO).

Handle: RePEc:zbw:glodps:1717

Other versions of this item:

Kemper, Jan & Rostam-Afschar, Davud, 2026. "Earning While Learning: How to Run Batched Bandit Experiments," IZA Discussion Papers 18429, IZA Network @ LISER.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Kemper, Jan & Rostam-Afschar, Davud, 2025. "Inference for batched adaptive experiments," ZEW Discussion Papers 25-070, ZEW - Leibniz Centre for European Economic Research.
- Jan Kemper & Davud Rostam-Afschar, 2025. "Inference for Batched Adaptive Experiments," Papers 2512.10156, arXiv.org.
Chao Qin & Daniel Russo, 2024. "Optimizing Adaptive Experiments: A Unified Approach to Regret Minimization and Best-Arm Identification," Papers 2402.10592, arXiv.org, revised Jul 2024.
Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
Arthur Charpentier & Romuald Élie & Carl Remlinger, 2023. "Reinforcement Learning in Economics and Finance," Computational Economics, Springer;Society for Computational Economics, vol. 62(1), pages 425-462, June.
Gaul, Johannes J. & Keusch, Florian & Rostam-Afschar, Davud & Simon, Thomas, 2024. "Invitation Messages for Business Surveys: A Multi-Armed Bandit Experiment," GLO Discussion Paper Series 1540, Global Labor Organization (GLO).
- Gaul, Johannes J. & Keusch, Florian & Rostam-Afschar, Davud & Simon, Thomas, 2024. "Invitation Messages for Business Surveys: A Multi-Armed Bandit Experiment," IZA Discussion Papers 17534, IZA Network @ LISER.
- Gaul, Johannes J. & Keusch, Florian & Rostam-Afschar, Davud & Simon, Thomas, 2025. "Invitation messages for business surveys: A multi-armed bandit experiment," ZEW Discussion Papers 25-003, ZEW - Leibniz Centre for European Economic Research.
Arthur Charpentier & Romuald Elie & Carl Remlinger, 2020. "Reinforcement Learning in Economics and Finance," Papers 2003.10014, arXiv.org.
Gallego, Jorge & Rivero, Gonzalo & Martínez, Juan, 2021. "Preventing rather than punishing: An early warning model of malfeasance in public procurement," International Journal of Forecasting, Elsevier, vol. 37(1), pages 360-377.
- J Gallego & G Rivero & J.D. Martínez, 2018. "Preventing rather than Punishing: An Early Warning Model of Malfeasance in Public Procurement," Documentos de Trabajo 16724, Universidad del Rosario.
Sebastian Galiani & Juan Pantano, 2021. "Structural Models: Inception and Frontier," NBER Working Papers 28698, National Bureau of Economic Research, Inc.
Frederico Finan & Demian Pouzo, 2021. "Reinforcing RCTs with Multiple Priors while Learning about External Validity," Papers 2112.09170, arXiv.org, revised Sep 2024.
A Stefano Caria & Grant Gordon & Maximilian Kasy & Simon Quinn & Soha Osman Shami & Alexander Teytelboym, 2024. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," Journal of the European Economic Association, European Economic Association, vol. 22(2), pages 781-836.
- A. Stefano Caria & Grant Gordon & Maximilian Kasy & Simon Quinn & Soha Shami & Alexander Teytelboym, 2020. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," CSAE Working Paper Series 2020-20, Centre for the Study of African Economies, University of Oxford.
- Caria, Stefano & Gordon, Grant & Kasy, Maximilian & Quinn, Simon & Shami, Soha & Teytelboym, Alexander, 2021. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," CAGE Online Working Paper Series 547, Competitive Advantage in the Global Economy (CAGE).
- Caria, Stefano & Gordon, Grant & Kasy, Maximilian & Quinn, Simon & Shami, Soha & Teytelboym, Alexander, 2021. "An Adaptive Targeted Field Experiment : Job Search Assistance for Refugees in Jordan," The Warwick Economics Research Paper Series (TWERPS) 1335, University of Warwick, Department of Economics.
- Stefano Caria & Grant Gordon & Maximilian Kasy & Simon Quinn & Soha Shami & Alexander Teytelboym, 2020. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," CESifo Working Paper Series 8535, CESifo.
- Quinn, Simon & Caria, Stefano & Gordon, Grant & Kasy, Maximilian & Shami, Soha & Teytelboym, Alexander, 2020. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," CEPR Discussion Papers 15359, C.E.P.R. Discussion Papers.
Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Teresa Molina Millán & Karen Macours, 2017. "Attrition in randomized control trials: Using tracking information to correct bias," FEUNL Working Paper Series novaf:wp1702, Universidade Nova de Lisboa, Faculdade de Economia.
Labib Shami & Teddy Lazebnik, 2024. "Implementing Machine Learning Methods in Estimating the Size of the Non-observed Economy," Computational Economics, Springer;Society for Computational Economics, vol. 63(4), pages 1459-1476, April.
Orazio Attanasio & Sarah Cattan & Emla Fitzsimons & Costas Meghir & Marta Rubio-Codina, 2020. "Estimating the Production Function for Human Capital: Results from a Randomized Controlled Trial in Colombia," American Economic Review, American Economic Association, vol. 110(1), pages 48-85, January.
- Orazio Attanasio & Sarah Cattan & Emla Fitzsimons & Costas Meghir & Marta Rubio Codina, 2015. "Estimating the production function for human capital: results from a randomized controlled trial in Colombia," IFS Working Papers W15/06, Institute for Fiscal Studies.
- Orazio Attanasio & Sarah Cattan & Emla Fitzsimons & Costas Meghir & Marta Rubio Codina, 2020. "Estimating the production function for human capital: results from a randomized controlled trial in Colombia," IFS Working Papers W20/3, Institute for Fiscal Studies.
- Orazio Attanasio & Sarah Cattan & Emla Fitzsimons & Costas Meghir & Marta Rubio Codina, 2018. "Estimating the production function for human capital: results from a randomized controlled trial in Colombia," IFS Working Papers W18/18, Institute for Fiscal Studies.
- Orazio Attanasio & Sarah Cattan & Emla Fitzsimons & Costas Meghir & Marta Rubio Codina, 2017. "Estimating the production function for human capital: results from a randomized controlled trial in Colombia," IFS Working Papers W17/06, Institute for Fiscal Studies.
Hurmeranta, Risto & Lyytikäinen, Teemu, 2025. "Nominal Loss Aversion in the Housing Market and Household Mobility," Working Papers 178, VATT Institute for Economic Research.
Chen, Ruoyu & Jiang, Hanchen & Quintero, Luis E., 2023. "Measuring the value of rent stabilization and understanding its implications for racial inequality: Evidence from New York City," Regional Science and Urban Economics, Elsevier, vol. 103(C).
- Chen, Ruoyu & Jiang, Hanchen & Quintero, Luis E., 2022. "Measuring the Value of Rent Stabilization and Understanding its Implications for Racial Inequality: Evidence from New York City," GLO Discussion Paper Series 1102, Global Labor Organization (GLO).
Jesse Rothstein, 2015. "Teacher Quality Policy When Supply Matters," American Economic Review, American Economic Association, vol. 105(1), pages 100-130, January.
- Jesse Rothstein, 2012. "Teacher quality policy when supply matters," Working Papers 2012/35, Institut d'Economia de Barcelona (IEB).
- Jesse Rothstein, 2012. "Teacher Quality Policy When Supply Matters," NBER Working Papers 18419, National Bureau of Economic Research, Inc.
- Rothstein, Jesse, 2012. "Teacher Quality Policy When Supply Matters," Institute for Research on Labor and Employment, Working Paper Series qt81q0f4bc, Institute of Industrial Relations, UC Berkeley.
- Rothstein, Jesse, 2012. "Teacher Quality Policy When Supply Matters," Department of Economics, Working Paper Series qt81q0f4bc, Department of Economics, Institute for Business and Economic Research, UC Berkeley.
Victor Iajya & Nicola Lacetera & Mario Macis & Robert Slonim, 2012. "The Effects of Information, Social and Economic Incentives on Voluntary Undirected Blood Donations: Evidence from a Randomized Controlled Trial in Argentina," NBER Working Papers 18630, National Bureau of Economic Research, Inc.
Dang, Hai-Anh & Carleto, Gero & Gourlay, Sydney & Abanokova, Kseniya, 2023. "Addressing Soil Quality Data Gaps with Imputation: Evidence from Ethiopia and Uganda," 2023 Annual Meeting, July 23-25, Washington D.C. 335648, Agricultural and Applied Economics Association.
- Dang, Hai-Anh H & Carletto, Calogero & Gourlay, Sydney & Abanokova, Kseniya, 2024. "Addressing Soil Quality Data Gaps with Imputation: Evidence from Ethiopia and Uganda," IZA Discussion Papers 17064, IZA Network @ LISER.
- Dang, Hai-Anh & Carletto, Calogero & Gourlay, Sydney & Abanokova, Kseniya, 2024. "Addressing Soil Quality Data Gaps with Imputation: Evidence from Ethiopia and Uganda," GLO Discussion Paper Series 1445, Global Labor Organization (GLO).

More about this item

JEL classification:

C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General
C11 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Bayesian Analysis: General
C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
C8 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs
C87 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Econometric Software
C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software
C9 - Mathematical and Quantitative Methods - - Design of Experiments
D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CMP-2026-03-02 (Computational Economics)
NEP-ECM-2026-03-02 (Econometrics)
NEP-EXP-2026-03-02 (Experimental Economics)
