Online Action Learning in High Dimensions: A Conservative Perspective

Author

Listed:

Claudio Cardoso Flores
Marcelo Cunha Medeiros

Registered:

Marcelo C. Medeiros

Abstract

Sequential learning problems are common in several fields of research and in practical applications. Examples include dynamic pricing and assortment and the design of auctions and incentives, and such problems permeate a large number of sequential treatment experiments. In this paper, we extend one of the most popular learning solutions, the $\epsilon_t$-greedy heuristic, to high-dimensional contexts under a conservative directive. We do this by reallocating part of the time the original rule spends adopting completely new actions to a more focused search within a restricted set of promising actions. The resulting rule may be useful for practical applications that still value surprises, although at a decreasing rate, while also restricting the adoption of unusual actions. We find reasonable bounds, holding with high probability, for the cumulative regret of a conservative high-dimensional decaying $\epsilon_t$-greedy rule. We also provide a lower bound on the cardinality of the set of viable actions that implies an improved regret bound for the conservative version compared with its non-conservative counterpart. Additionally, we show that end-users have sufficient flexibility in choosing how much safety they want, since this can be tuned without affecting the theoretical properties. We illustrate our proposal both in a simulation exercise and using a real dataset.
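
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. The decay schedule eps0 / t, the top_k definition of the promising set, and the names conservative_eps_greedy, safe_share and top_k are all assumptions made for illustration. Reward estimates are taken as given, so the paper's high-dimensional estimation step is not reproduced here.

```python
import random


def conservative_eps_greedy(estimates, t, rng, eps0=1.0, safe_share=0.5, top_k=3):
    """Pick an action index at round t (t >= 1).

    estimates  : current estimated reward of each action (assumed given)
    eps0       : scale of the assumed decay schedule eps_t = min(1, eps0 / t)
    safe_share : share of exploration rounds redirected to the promising set
    top_k      : size of the promising set (assumed: the top_k estimates)
    """
    eps_t = min(1.0, eps0 / t)
    # Action indices ordered from highest to lowest estimated reward.
    ranked = sorted(range(len(estimates)), key=lambda a: estimates[a], reverse=True)
    u = rng.random()
    if u >= eps_t:
        # Exploit: play the action that currently looks best.
        return ranked[0]
    if u < eps_t * safe_share:
        # Conservative exploration: search only among promising actions.
        return rng.choice(ranked[:top_k])
    # Unrestricted exploration: any action, as in the plain eps_t-greedy rule.
    return rng.randrange(len(estimates))


if __name__ == "__main__":
    rng = random.Random(0)
    estimates = [0.1, 0.9, 0.5, 0.2, 0.7]
    picks = [conservative_eps_greedy(estimates, t, rng) for t in range(1, 21)]
    print(picks)
```

In this sketch, safe_share = 0 gives back the plain decaying rule and safe_share = 1 confines all exploration to the promising set, which is one simple way to picture a tunable level of safety.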

Suggested Citation

Claudio Cardoso Flores & Marcelo Cunha Medeiros, 2020. "Online Action Learning in High Dimensions: A Conservative Perspective," Papers 2009.13961, arXiv.org, revised Mar 2024.

Handle: RePEc:arx:papers:2009.13961

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Anders Bredahl Kock & David Preinerstorfer, 2024. "Regularizing Fairness in Optimal Policy Learning with Distributional Targets," Papers 2401.17909, arXiv.org, revised May 2025.
Marçal, Emerson Fernandes & Cunha, Ronan & Merlin, Giovanni Tondin & Simões, Oscar, 2017. "The aftermath of 2008 turmoil on Brazilian economy: Tsunami or “Marolinha”?," Textos para discussão 459, FGV EESP - Escola de Economia de São Paulo, Fundação Getulio Vargas (Brazil).
Francetich, Alejandro & Kreps, David, 2020. "Choosing a good toolkit, II: Bayes-rule based heuristics," Journal of Economic Dynamics and Control, Elsevier, vol. 111(C).
Dennis Shen & Peng Ding & Jasjeet Sekhon & Bin Yu, 2022. "Same Root Different Leaves: Time Series and Cross-Sectional Methods in Panel Data," Papers 2207.14481, arXiv.org, revised Oct 2022.
Zhentao Shi & Jin Xi & Haitian Xie, 2025. "A Synthetic Business Cycle Approach to Counterfactual Analysis with Nonstationary Macroeconomic Data," Papers 2505.22388, arXiv.org.
Michael Funke & Helery Tasane, 2025. "Regional economic impacts of the Øresund cross-border fixed link: Cui Bono?," Regional Studies, Taylor & Francis Journals, vol. 59(1), pages 2573115-257, December.
- Michael Funke & Kadri Männasoo & Helery Tasane, 2023. "Regional Economic Impacts of the Øresund Cross-Border Fixed Link: Cui Bono?," CESifo Working Paper Series 10557, CESifo.
Diana Moreira & Santiago Pérez, 2022. "Who Benefits from Meritocracy?," NBER Working Papers 30113, National Bureau of Economic Research, Inc.
- Moreira, Diana B. & Perez, Santiago, 2022. "Who Benefits from Meritocracy?," IZA Discussion Papers 15341, Institute of Labor Economics (IZA).
Rong Jin & David Simchi-Levi & Li Wang & Xinshang Wang & Sen Yang, 2021. "Shrinking the Upper Confidence Bound: A Dynamic Product Selection Problem for Urban Warehouses," Management Science, INFORMS, vol. 67(8), pages 4756-4771, August.
Jonas Radbruch & Amelie Schiprowski, 2025. "Interview Sequences and the Formation of Subjective Assessments," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 92(2), pages 1226-1256.
- Jonas Radbruch & Amelie Schiprowski, 2020. "Interview Sequences and the Formation of Subjective Assessments," ECONtribute Discussion Papers Series 045, University of Bonn and University of Cologne, Germany.
- Radbruch, Jonas & Schiprowski, Amelie, 2024. "Interview Sequences and the Formation of Subjective Assessments," CEPR Discussion Papers 18839, C.E.P.R. Discussion Papers.
- Jonas Radbruch & Amelie Schiprowski, 2024. "Interview Sequences and the Formation of Subjective Assessments," Rationality and Competition Discussion Paper Series 497, CRC TRR 190 Rationality and Competition.
- Jonas Radbruch & Amelie Schiprowski, 2021. "Interview Sequences and the Formation of Subjective Assessments," CRC TR 224 Discussion Paper Series crctr224_2021_268v2, University of Bonn and University of Mannheim, Germany.
- Jonas Radbruch & Amelie Schiprowski, 2024. "Interview Sequences and the Formation of Subjective Assessments," CESifo Working Paper Series 10957, CESifo.
- Radbruch, Jonas & Schiprowski, Amelie, 2021. "Interview Sequences and the Formation of Subjective Assessments," IZA Discussion Papers 14799, Institute of Labor Economics (IZA).
Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
Xiong, Ruoxuan & Pelger, Markus, 2023. "Large dimensional latent factor modeling with missing observations and applications to causal inference," Journal of Econometrics, Elsevier, vol. 233(1), pages 271-301.
- Ruoxuan Xiong & Markus Pelger, 2019. "Large Dimensional Latent Factor Modeling with Missing Observations and Applications to Causal Inference," Papers 1910.08273, arXiv.org, revised Jan 2022.
Mou, Shandong & Robb, David J. & DeHoratius, Nicole, 2018. "Retail store operations: Literature review and research directions," European Journal of Operational Research, Elsevier, vol. 265(2), pages 399-422.
Zhentao Shi & Yishu Wang, 2025. "L2-relaxation for Economic Prediction," Papers 2510.12183, arXiv.org.
Coraggio, Luca & Pagano, Marco & Scognamiglio, Annalisa & Tåg, Joacim, 2025. "JAQ of all trades: Job mismatch, firm productivity and managerial quality," Journal of Financial Economics, Elsevier, vol. 164(C).
- Luca Coraggio & Marco Pagano & Annalisa Scognamiglio & Joacim Tåg, 2022. "JAQ of All Trades: Job Mismatch, Firm Productivity and Managerial Quality," EIEF Working Papers Series 2205, Einaudi Institute for Economics and Finance (EIEF), revised Mar 2022.
- Coraggio, Luca & Pagano, Marco & Scognamiglio, Annalisa & Tåg, Joacim, 2022. "JAQ of All Trades: Job Mismatch, Firm Productivity and Managerial Quality," CEPR Discussion Papers 17167, C.E.P.R. Discussion Papers.
- Coraggio, Luca & Pagano, Marco & Scognamiglio, Annalisa & Tåg, Joacim, 2022. "JAQ of All Trades: Job Mismatch, Firm Productivity and Managerial Quality," Working Paper Series 1427, Research Institute of Industrial Economics, revised 11 Dec 2024.
Jason Poulos & Andrea Albanese & Andrea Mercatanti & Fan Li, 2021. "Retrospective causal inference via matrix completion, with an evaluation of the effect of European integration on cross-border employment," Papers 2106.00788, arXiv.org.
- Jason Poulos & Andrea Albanese & Andrea Mercatanti & Fan Li, 2021. "Retrospective causal inference via matrix completion, with an evaluation of the effect of European integration on cross-border employment," LISER Working Paper Series 2021-07, Luxembourg Institute of Socio-Economic Research (LISER).
- Poulos, Jason & Albanese, Andrea & Mercatanti, Andrea & Li, Fan, 2021. "Retrospective Causal Inference via Matrix Completion, with an Evaluation of the Effect of European Integration on Cross-Border Employment," IZA Discussion Papers 14472, Institute of Labor Economics (IZA).
Agrawal, Priyank & Tulabandhula, Theja & Avadhanula, Vashist, 2023. "A tractable online learning algorithm for the multinomial logit contextual bandit," European Journal of Operational Research, Elsevier, vol. 310(2), pages 737-750.
Kock, Anders Bredahl & Preinerstorfer, David & Veliyev, Bezirgen, 2023. "Treatment recommendation with distributional targets," Journal of Econometrics, Elsevier, vol. 234(2), pages 624-646.
- Anders Bredahl Kock & David Preinerstorfer & Bezirgen Veliyev, 2020. "Treatment recommendation with distributional targets," Papers 2005.09717, arXiv.org, revised Apr 2022.
Gel, Esma S. & Salman, F. Sibel, 2022. "Dynamic ordering decisions with approximate learning of supply yield uncertainty," International Journal of Production Economics, Elsevier, vol. 243(C).
Yining Wang & Boxiao Chen & David Simchi-Levi, 2021. "Multimodal Dynamic Pricing," Management Science, INFORMS, vol. 67(10), pages 6136-6152, October.
Kimia Keshanian & Daniel Zantedeschi & Kaushik Dutta, 2022. "Features Selection as a Nash-Bargaining Solution: Applications in Online Advertising and Information Systems," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2485-2501, September.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-RMG-2020-10-19 (Risk Management)
