Some performance considerations when using multi-armed bandit algorithms in the presence of missing data

Author

Listed:

Xijin Chen
Kim May Lee
Sofia S Villar
David S Robertson

Abstract

When comparing the performance of multi-armed bandit algorithms, the potential impact of missing data is often overlooked. In practice, missing data also affects their implementation, and the simplest way to handle it is to continue sampling according to the original bandit algorithm, ignoring missing outcomes. Through an extensive simulation study, assuming the rewards are missing at random, we investigate how this approach affects the performance of several bandit algorithms. We focus on two-armed bandit algorithms with binary outcomes in the context of patient allocation for clinical trials with relatively small sample sizes. However, our results apply to other applications of bandit algorithms where missing data is expected to occur. We assess the resulting operating characteristics, including the expected reward, and consider different probabilities of missingness in both arms. The key finding of our work is that, when using the simplest strategy of ignoring missing data, the impact on the expected performance of multi-armed bandit strategies varies according to the way these strategies balance the exploration-exploitation trade-off. Algorithms that are geared towards exploration continue to assign samples to the arm with more missing responses (which, being perceived as the arm with less observed information, is deemed more appealing by the algorithm than it would otherwise be). In contrast, algorithms that are geared towards exploitation rapidly assign a high value to samples from the arms with a currently high mean, irrespective of the number of observations per arm. Furthermore, for algorithms focusing more on exploration, we illustrate that the problem of missing responses can be alleviated using a simple mean imputation approach.
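
The mechanism described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses a UCB1-style index as a stand-in for an exploration-geared algorithm, generates missingness independently with a fixed probability per arm, and interprets "simple mean imputation" as replacing a missing outcome with the arm's current observed mean. The function name `run_ucb_trial`, its parameters, and the numeric values in the usage example are all hypothetical.

```python
import math
import random


def run_ucb_trial(p_success, p_missing, n_patients, impute=False, seed=0):
    """Simulate one two-armed Bernoulli trial under a UCB1-style index.

    p_success[k]: true success probability of arm k (hypothetical values).
    p_missing[k]: probability that an outcome on arm k is never observed.
    impute=False: a missing outcome leaves the arm's statistics unchanged.
    impute=True:  a missing outcome is replaced by the arm's current mean.
    Returns (allocations per arm, number of true successes).
    """
    rng = random.Random(seed)
    counts = [0.0, 0.0]  # outcomes treated as observed (including imputed)
    sums = [0.0, 0.0]    # sum of observed (or imputed) rewards
    allocations = [0, 0]
    successes = 0

    for t in range(1, n_patients + 1):
        # Index = empirical mean + exploration bonus; arms with no data go first.
        index = []
        for k in range(2):
            if counts[k] == 0:
                index.append(float("inf"))
            else:
                bonus = math.sqrt(2.0 * math.log(t) / counts[k])
                index.append(sums[k] / counts[k] + bonus)
        arm = 0 if index[0] >= index[1] else 1  # ties go to arm 0

        allocations[arm] += 1
        reward = 1 if rng.random() < p_success[arm] else 0
        successes += reward  # the patient's true outcome counts even if unobserved
        missing = rng.random() < p_missing[arm]

        if not missing:
            counts[arm] += 1
            sums[arm] += reward
        elif impute and counts[arm] > 0:
            # Mean imputation (assumed form): the arm's mean is unchanged,
            # but its count grows, so the exploration bonus shrinks.
            sums[arm] += sums[arm] / counts[arm]
            counts[arm] += 1

    return allocations, successes


if __name__ == "__main__":
    # Hypothetical scenario: arm 1 is better but loses 40% of its outcomes.
    for impute in (False, True):
        alloc, succ = run_ucb_trial([0.3, 0.5], [0.0, 0.4], 100,
                                    impute=impute, seed=1)
        print("impute =", impute, "allocations =", alloc, "successes =", succ)
```

Without imputation, a missing outcome leaves the arm's count unchanged, so its exploration bonus stays large and the index keeps favouring that arm. With imputation the count increases while the mean stays the same, which is one way to read the abstract's remark that mean imputation alleviates the problem for exploration-geared algorithms. A single run is only illustrative; operating characteristics would need to be averaged over many seeds.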

Suggested Citation

Xijin Chen & Kim May Lee & Sofia S Villar & David S Robertson, 2022. "Some performance considerations when using multi-armed bandit algorithms in the presence of missing data," PLOS ONE, Public Library of Science, vol. 17(9), pages 1-28, September.

Handle: RePEc:plo:pone00:0274272
DOI: 10.1371/journal.pone.0274272

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Alessandro Baldi Antognini & Marco Novelli & Maroussa Zagoraiou, 2022. "A simple solution to the inadequacy of asymptotic likelihood-based inference for response-adaptive clinical trials," Statistical Papers, Springer, vol. 63(1), pages 157-180, February.
Biswas, Atanu & Bhattacharya, Rahul, 2010. "An optimal response-adaptive design with dual constraints," Statistics & Probability Letters, Elsevier, vol. 80(3-4), pages 177-185, February.
Atkinson, Anthony C. & Biswas, Atanu, 2017. "Optimal response and covariate-adaptive biased-coin designs for clinical trials with continuous multivariate or longitudinal responses," LSE Research Online Documents on Economics 66761, London School of Economics and Political Science, LSE Library.
Biswas, Atanu & Bhattacharya, Rahul, 2011. "Optimal response-adaptive allocation designs in phase III clinical trials: Incorporating ethics in optimality," Statistics & Probability Letters, Elsevier, vol. 81(8), pages 1155-1160, August.
Atkinson, Anthony C. & Biswas, Atanu, 2017. "Optimal response and covariate-adaptive biased-coin designs for clinical trials with continuous multivariate or longitudinal responses," Computational Statistics & Data Analysis, Elsevier, vol. 113(C), pages 297-310.
Yi, Yanqing, 2013. "Exact statistical power for response adaptive designs," Computational Statistics & Data Analysis, Elsevier, vol. 58(C), pages 201-209.
Uttam Bandyopadhyay & Rahul Bhattacharya, 2009. "Response adaptive procedures with dual optimality," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 63(3), pages 353-367, August.
Alessandro Baldi Antognini & Marco Novelli & Maroussa Zagoraiou, 2022. "A new inferential approach for response-adaptive clinical trials: the variance-stabilized bootstrap," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 235-254, March.
Uttam Bandyopadhyay & Atanu Biswas & Shirsendu Mukherjee, 2009. "Adaptive two-treatment two-period crossover design for binary treatment responses incorporating carry-over effects," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 18(1), pages 13-33, March.
Hengtao Zhang & Guosheng Yin, 2021. "Response‐adaptive rerandomization," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(5), pages 1281-1298, November.
Yifei Zhang & Sha Cao & Chi Zhang & Ick Hoon Jin & Yong Zang, 2021. "A Bayesian adaptive phase I/II clinical trial design with late‐onset competing risk outcomes," Biometrics, The International Biometric Society, vol. 77(3), pages 796-808, September.
Ruohan Zhan & Zhimei Ren & Susan Athey & Zhengyuan Zhou, 2024. "Policy Learning with Adaptively Collected Data," Management Science, INFORMS, vol. 70(8), pages 5270-5297, August.
- Ruohan Zhan & Zhimei Ren & Susan Athey & Zhengyuan Zhou, 2021. "Policy Learning with Adaptively Collected Data," Papers 2105.02344, arXiv.org, revised Nov 2022.
- Zhan, Ruohan & Ren, Zhimei & Athey, Susan & Zhou, Zhengyuan, 2021. "Policy Learning with Adaptively Collected Data," Research Papers 3963, Stanford University, Graduate School of Business.
Rong Jin & David Simchi-Levi & Li Wang & Xinshang Wang & Sen Yang, 2021. "Shrinking the Upper Confidence Bound: A Dynamic Product Selection Problem for Urban Warehouses," Management Science, INFORMS, vol. 67(8), pages 4756-4771, August.
Beibei Guo & Ying Yuan, 2023. "DROID: dose‐ranging approach to optimizing dose in oncology drug development," Biometrics, The International Biometric Society, vol. 79(4), pages 2907-2919, December.
Hanan Hammouri & Marwan Alquran & Ruwa Abdel Muhsen & Jaser Altahat, 2022. "Optimal Weighted Multiple-Testing Procedure for Clinical Trials," Mathematics, MDPI, vol. 10(12), pages 1-19, June.
Rahul Bhattacharya & Madhumita Shome, 2015. "A randomized two stage allocation for continuous response clinical trials," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 24(3), pages 373-386, September.
Yining Wang & Boxiao Chen & David Simchi-Levi, 2021. "Multimodal Dynamic Pricing," Management Science, INFORMS, vol. 67(10), pages 6136-6152, October.
Kimia Keshanian & Daniel Zantedeschi & Kaushik Dutta, 2022. "Features Selection as a Nash-Bargaining Solution: Applications in Online Advertising and Information Systems," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2485-2501, September.
Yanqing Yi & Yuan Yuan, 2013. "An optimal allocation for response-adaptive designs," Journal of Applied Statistics, Taylor & Francis Journals, vol. 40(9), pages 1996-2008, September.
Ruicheng Ao & Hongyu Chen & David Simchi-Levi, 2024. "Prediction-Guided Active Experiments," Papers 2411.12036, arXiv.org, revised Nov 2024.
