Printed from https://ideas.repec.org/p/arx/papers/1904.01047.html

Dynamically Optimal Treatment Allocation using Reinforcement Learning

Authors

Listed:
  • Karun Adusumilli
  • Friedrich Geiecke
  • Claudio Schilter

Abstract

Dynamic decisions are pivotal to economic policy making. We show how existing evidence from randomized control trials can be utilized to guide personalized decisions in challenging dynamic environments with constraints such as limited budget or queues. Recent developments in reinforcement learning make it possible to solve many realistically complex settings for the first time. We allow for restricted policy functions and prove that their regret decays at rate $n^{-1/2}$, the same as in the static case. We illustrate our methods with an application to job training. The approach scales to a wide range of important problems faced by policy makers.

Suggested Citation

  • Karun Adusumilli & Friedrich Geiecke & Claudio Schilter, 2019. "Dynamically Optimal Treatment Allocation using Reinforcement Learning," Papers 1904.01047, arXiv.org, revised May 2022.
  • Handle: RePEc:arx:papers:1904.01047

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1904.01047
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    2. Susan Athey & Stefan Wager, 2021. "Policy Learning With Observational Data," Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
    3. Toru Kitagawa & Aleksey Tetenov, 2017. "Equality-minded treatment choice," CeMMAP working papers 10/17, Institute for Fiscal Studies.
    4. Toru Kitagawa & Aleksey Tetenov, 2021. "Equality-Minded Treatment Choice," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 39(2), pages 561-574, March.
    5. Davide Viviano, 2019. "Policy Targeting under Network Interference," Papers 1906.10258, arXiv.org, revised Apr 2024.
    6. Eric Mbakop & Max Tabord‐Meehan, 2021. "Model Selection for Treatment Choice: Penalized Welfare Maximization," Econometrica, Econometric Society, vol. 89(2), pages 825-848, March.
    7. Anders Bredahl Kock & Martin Thyrsgaard, 2017. "Optimal sequential treatment allocation," Papers 1705.09952, arXiv.org, revised Aug 2018.
    8. Garbero, Alessandra & Sakos, Grayson & Cerulli, Giovanni, 2023. "Towards data-driven project design: Providing optimal treatment rules for development projects," Socio-Economic Planning Sciences, Elsevier, vol. 89(C).
    9. Firpo, Sergio & Galvao, Antonio F. & Kobus, Martyna & Parker, Thomas & Rosa-Dias, Pedro, 2020. "Loss Aversion and the Welfare Ranking of Policy Interventions," IZA Discussion Papers 13176, Institute of Labor Economics (IZA).
    10. Toru Kitagawa & Hugo Lopez & Jeff Rowley, 2022. "Stochastic Treatment Choice with Empirical Welfare Updating," Papers 2211.01537, arXiv.org, revised Feb 2023.
    11. Shosei Sakaguchi, 2021. "Estimation of Optimal Dynamic Treatment Assignment Rules under Policy Constraints," Papers 2106.05031, arXiv.org, revised Apr 2024.
    12. Keisuke Hirano & Jack R. Porter, 2016. "Panel Asymptotics and Statistical Decision Theory," The Japanese Economic Review, Japanese Economic Association, vol. 67(1), pages 33-49, March.
    13. Yuya Sasaki & Takuya Ura, 2020. "Welfare Analysis via Marginal Treatment Effects," Papers 2012.07624, arXiv.org.
    14. Davide Viviano & Jess Rudder, 2020. "Policy design in experiments with unknown interference," Papers 2011.08174, arXiv.org, revised Dec 2023.
    15. Anders Bredahl Kock & David Preinerstorfer, 2024. "Regularizing Discrimination in Optimal Policy Learning with Distributional Targets," Papers 2401.17909, arXiv.org.
    16. Anders Bredahl Kock & David Preinerstorfer & Bezirgen Veliyev, 2022. "Functional Sequential Treatment Allocation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 117(539), pages 1311-1323, September.
    17. Toru Kitagawa & Guanyi Wang, 2021. "Who should get vaccinated? Individualized allocation of vaccines over SIR network," CeMMAP working papers CWP28/21, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    18. Kitagawa, Toru & Wang, Guanyi, 2023. "Who should get vaccinated? Individualized allocation of vaccines over SIR network," Journal of Econometrics, Elsevier, vol. 232(1), pages 109-131.
    19. Chunrong Ai & Yue Fang & Haitian Xie, 2024. "Data-driven Policy Learning for a Continuous Treatment," Papers 2402.02535, arXiv.org.
    20. Toru Kitagawa & Sokbae Lee & Chen Qiu, 2022. "Treatment Choice with Nonlinear Regret," Papers 2205.08586, arXiv.org, revised Feb 2024.
