
Online Action Learning in High Dimensions: A Conservative Perspective

Author

Listed:
  • Claudio Cardoso Flores
  • Marcelo Cunha Medeiros

Abstract

Sequential learning problems are common in several fields of research and in practical applications. Examples include dynamic pricing and assortment and the design of auctions and incentives, and such problems permeate a large number of sequential treatment experiments. In this paper, we extend one of the most popular learning solutions, the $\epsilon_t$-greedy heuristic, to high-dimensional contexts under a conservative directive. We do this by reallocating part of the time the original rule spends adopting completely new actions to a more focused search within a restricted set of promising actions. The resulting rule may be useful for practical applications that still value surprises, although at a decreasing rate, while also restricting the adoption of unusual actions. We find reasonable bounds, holding with high probability, for the cumulative regret of a conservative high-dimensional decaying $\epsilon_t$-greedy rule. We also provide a lower bound on the cardinality of the set of viable actions that implies an improved regret bound for the conservative version compared with its non-conservative counterpart. Additionally, we show that end users have sufficient flexibility in establishing how much safety they want, since it can be tuned without affecting the theoretical properties. We illustrate our proposal both in a simulation exercise and using a real dataset.
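
The idea of splitting exploration time between unrestricted draws and a focused search among promising actions can be illustrated with a minimal sketch. This is not the authors' algorithm: it ignores the high-dimensional covariates and the estimation step entirely, and the function name, the decay schedule `c / t`, and the parameters `safe_share` and `top_k` are all assumptions introduced here for illustration only.

```python
import random


def conservative_epsilon_greedy(estimates, t, rng, c=1.0, safe_share=0.5, top_k=3):
    """Pick an action index at round t (t >= 1).

    Illustrative sketch only; every name and default below is an assumption:
      estimates  -- current estimated reward of each action
      c          -- scale of the decaying exploration probability
      safe_share -- fraction of exploration rounds spent on promising actions
      top_k      -- size of the hypothetical "promising" set
    """
    # Decaying exploration probability (assumed schedule: c / t, capped at 1).
    epsilon_t = min(1.0, c / t)

    # Rank actions from highest to lowest estimated reward.
    ranked = sorted(range(len(estimates)), key=lambda a: estimates[a], reverse=True)

    # Exploit: play the action with the best current estimate.
    if rng.random() >= epsilon_t:
        return ranked[0]

    # Conservative exploration: search only among the top_k promising actions.
    if rng.random() < safe_share:
        return rng.choice(ranked[:top_k])

    # Unrestricted exploration: any action, including unusual ones.
    return rng.randrange(len(estimates))


if __name__ == "__main__":
    rng = random.Random(0)
    estimates = [0.1, 0.9, 0.5, 0.3, 0.7]
    choices = [conservative_epsilon_greedy(estimates, t, rng) for t in range(1, 21)]
    print(choices)
```

In this sketch, setting `safe_share` to 0 recovers a plain decaying $\epsilon_t$-greedy rule, while values closer to 1 push exploration toward the restricted set, which loosely mirrors the tunable degree of safety the abstract describes.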

Suggested Citation

  • Claudio Cardoso Flores & Marcelo Cunha Medeiros, 2020. "Online Action Learning in High Dimensions: A Conservative Perspective," Papers 2009.13961, arXiv.org, revised Mar 2024.
  • Handle: RePEc:arx:papers:2009.13961

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2009.13961
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Kock, Anders Bredahl & Preinerstorfer, David & Veliyev, Bezirgen, 2023. "Treatment recommendation with distributional targets," Journal of Econometrics, Elsevier, vol. 234(2), pages 624-646.
    2. Hamsa Bastani & Mohsen Bayati, 2020. "Online Decision Making with High-Dimensional Covariates," Operations Research, INFORMS, vol. 68(1), pages 276-294, January.
    3. Anders Bredahl Kock & David Preinerstorfer & Bezirgen Veliyev, 2022. "Functional Sequential Treatment Allocation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 117(539), pages 1311-1323, September.
    4. Sanath Kumar Krishnamurthy & Susan Athey, 2020. "Survey Bandits with Regret Guarantees," Papers 2002.09814, arXiv.org.
    5. Danielle Li & Lindsey R. Raymond & Peter Bergman, 2020. "Hiring as Exploration," NBER Working Papers 27736, National Bureau of Economic Research, Inc.
    6. Carvalho, Carlos & Masini, Ricardo & Medeiros, Marcelo C., 2018. "ArCo: An artificial counterfactual approach for high-dimensional panel time-series data," Journal of Econometrics, Elsevier, vol. 207(2), pages 352-380.
    7. Denis Sauré & Assaf Zeevi, 2013. "Optimal Dynamic Assortment Planning with Demand Learning," Manufacturing & Service Operations Management, INFORMS, vol. 15(3), pages 387-404, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Anders Bredahl Kock & David Preinerstorfer, 2024. "Regularizing Discrimination in Optimal Policy Learning with Distributional Targets," Papers 2401.17909, arXiv.org.
    2. Marçal, Emerson Fernandes & Cunha, Ronan & Merlin, Giovanni Tondin & Simões, Oscar, 2017. "The aftermath of 2008 turmoil on Brazilian economy: Tsunami or “Marolinha”?," Textos para discussão 459, FGV EESP - Escola de Economia de São Paulo, Fundação Getulio Vargas (Brazil).
    3. Francetich, Alejandro & Kreps, David, 2020. "Choosing a good toolkit, II: Bayes-rule based heuristics," Journal of Economic Dynamics and Control, Elsevier, vol. 111(C).
    4. Dennis Shen & Peng Ding & Jasjeet Sekhon & Bin Yu, 2022. "Same Root Different Leaves: Time Series and Cross-Sectional Methods in Panel Data," Papers 2207.14481, arXiv.org, revised Oct 2022.
    5. Michael Funke & Kadri Männasoo & Helery Tasane, 2023. "Regional Economic Impacts of the Øresund Cross-Border Fixed Link: Cui Bono?," CESifo Working Paper Series 10557, CESifo.
    6. Diana Moreira & Santiago Pérez, 2022. "Who Benefits from Meritocracy?," NBER Working Papers 30113, National Bureau of Economic Research, Inc.
    7. Jonas Radbruch & Amelie Schiprowski, 2020. "Interview Sequences and the Formation of Subjective Assessments," ECONtribute Discussion Papers Series 045, University of Bonn and University of Cologne, Germany.
    8. Rong Jin & David Simchi-Levi & Li Wang & Xinshang Wang & Sen Yang, 2021. "Shrinking the Upper Confidence Bound: A Dynamic Product Selection Problem for Urban Warehouses," Management Science, INFORMS, vol. 67(8), pages 4756-4771, August.
    9. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    10. Ruoxuan Xiong & Markus Pelger, 2019. "Large Dimensional Latent Factor Modeling with Missing Observations and Applications to Causal Inference," Papers 1910.08273, arXiv.org, revised Jan 2022.
    11. Mou, Shandong & Robb, David J. & DeHoratius, Nicole, 2018. "Retail store operations: Literature review and research directions," European Journal of Operational Research, Elsevier, vol. 265(2), pages 399-422.
    12. Jason Poulos & Andrea Albanese & Andrea Mercatanti & Fan Li, 2021. "Retrospective causal inference via matrix completion, with an evaluation of the effect of European integration on cross-border employment," Papers 2106.00788, arXiv.org.
    13. Agrawal, Priyank & Tulabandhula, Theja & Avadhanula, Vashist, 2023. "A tractable online learning algorithm for the multinomial logit contextual bandit," European Journal of Operational Research, Elsevier, vol. 310(2), pages 737-750.
    14. Kock, Anders Bredahl & Preinerstorfer, David & Veliyev, Bezirgen, 2023. "Treatment recommendation with distributional targets," Journal of Econometrics, Elsevier, vol. 234(2), pages 624-646.
    15. Jonas Radbruch & Amelie Schiprowski, 2024. "Interview Sequences and the Formation of Subjective Assessments," Rationality and Competition Discussion Paper Series 497, CRC TRR 190 Rationality and Competition.
    16. Victor Chernozhukov & Kaspar Wüthrich & Yinchu Zhu, 2021. "An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 116(536), pages 1849-1864, October.
    17. Gel, Esma S. & Salman, F. Sibel, 2022. "Dynamic ordering decisions with approximate learning of supply yield uncertainty," International Journal of Production Economics, Elsevier, vol. 243(C).
    18. Yining Wang & Boxiao Chen & David Simchi-Levi, 2021. "Multimodal Dynamic Pricing," Management Science, INFORMS, vol. 67(10), pages 6136-6152, October.
    19. Kimia Keshanian & Daniel Zantedeschi & Kaushik Dutta, 2022. "Features Selection as a Nash-Bargaining Solution: Applications in Online Advertising and Information Systems," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2485-2501, September.
    20. Dipankar Das, 2023. "A Model of Competitive Assortment Planning Algorithm," Papers 2307.09479, arXiv.org.
