
Robust Learning for Optimal Dynamic Treatment Regimes with Observational Data

Author

Listed:
  • Shosei Sakaguchi

Abstract

Many public policies and medical interventions involve dynamics in their treatment assignments, where treatments are sequentially assigned to the same individuals across multiple stages, and the effect of treatment at each stage is usually heterogeneous with respect to the history of prior treatments and associated characteristics. We study statistical learning of optimal dynamic treatment regimes (DTRs) that guide the optimal treatment assignment for each individual at each stage based on the individual's history. We propose a step-wise doubly-robust approach to learn the optimal DTR using observational data under the assumption of sequential ignorability. The approach solves the sequential treatment assignment problem through backward induction, where, at each step, we combine estimators of propensity scores and action-value functions (Q-functions) to construct augmented inverse probability weighting estimators of values of policies for each stage. The approach consistently estimates the optimal DTR if either a propensity score or Q-function for each stage is consistently estimated. Furthermore, the resulting DTR can achieve the optimal convergence rate $n^{-1/2}$ of regret under mild conditions on the convergence rate for estimators of the nuisance parameters.
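The stage-wise building block of the approach — combining a propensity-score estimate and a Q-function estimate into an augmented inverse probability weighting (AIPW) estimate of a policy's value — can be illustrated for a single stage. The sketch below is a minimal simulation, not the paper's implementation: the data-generating process, the plug-in nuisance estimators, and the threshold policy class are all illustrative assumptions. In the paper's backward-induction procedure, this same construction is applied at each stage, with the stage-specific pseudo-outcome playing the role of `Y`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Illustrative single-stage data: binary treatment A, covariate X.
X = rng.normal(size=n)                             # covariate
e = 1 / (1 + np.exp(-X))                           # true propensity P(A=1|X)
A = rng.binomial(1, e)                             # observed treatment
Y = A * X + rng.normal(scale=0.5, size=n)          # treatment helps iff X > 0

# Nuisance estimates. For simplicity we use near-true propensities and
# arm-wise least-squares Q-functions; the paper allows flexible ML
# estimators fitted with cross-fitting.
e_hat = np.clip(e + rng.normal(scale=0.01, size=n), 0.05, 0.95)

def fit_q(a):
    """Fit Q(x, a) by least squares within treatment arm a."""
    mask = A == a
    coef = np.polyfit(X[mask], Y[mask], deg=1)
    return np.polyval(coef, X)

Q1, Q0 = fit_q(1), fit_q(0)

def aipw_value(policy):
    """AIPW estimate of E[Y(d)] for a policy d: X -> {0, 1}."""
    d = policy(X)
    Q_d = np.where(d == 1, Q1, Q0)                 # Q-function under the policy
    w = (A == d) / np.where(d == 1, e_hat, 1 - e_hat)
    return np.mean(Q_d + w * (Y - Q_d))            # doubly-robust correction

# Maximize the estimated value over a toy policy class: treat iff X > t.
thresholds = np.linspace(-2, 2, 81)
values = [aipw_value(lambda x, t=t: (x > t).astype(int)) for t in thresholds]
best_t = thresholds[np.argmax(values)]
print(f"estimated optimal threshold: {best_t:.2f}")  # true optimal rule: treat iff X > 0
```

Double robustness shows up in `aipw_value`: the estimate remains consistent if either `e_hat` or the Q-functions are consistent, since the weighted residual term corrects a misspecified Q-function and has mean zero when the propensity is correct.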

Suggested Citation

  • Shosei Sakaguchi, 2024. "Robust Learning for Optimal Dynamic Treatment Regimes with Observational Data," Papers 2404.00221, arXiv.org.
  • Handle: RePEc:arx:papers:2404.00221

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2404.00221
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shosei Sakaguchi, 2021. "Estimation of Optimal Dynamic Treatment Assignment Rules under Policy Constraints," Papers 2106.05031, arXiv.org, revised Apr 2024.
    2. Susan Athey & Stefan Wager, 2021. "Policy Learning With Observational Data," Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
    3. Weibin Mo & Yufeng Liu, 2022. "Efficient learning of optimal individualized treatment rules for heteroscedastic or misspecified treatment‐free effect models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 440-472, April.
    4. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    5. Davide Viviano & Jess Rudder, 2020. "Policy design in experiments with unknown interference," Papers 2011.08174, arXiv.org, revised May 2024.
    6. Davide Viviano, 2019. "Policy Targeting under Network Interference," Papers 1906.10258, arXiv.org, revised Apr 2024.
    7. Shi, Chengchun & Luo, Shikai & Le, Yuan & Zhu, Hongtu & Song, Rui, 2022. "Statistically efficient advantage learning for offline reinforcement learning in infinite horizons," LSE Research Online Documents on Economics 115598, London School of Economics and Political Science, LSE Library.
    8. Chunrong Ai & Yue Fang & Haitian Xie, 2024. "Data-driven Policy Learning for a Continuous Treatment," Papers 2402.02535, arXiv.org.
    9. Julia Hatamyar & Noemi Kreif, 2023. "Policy Learning with Rare Outcomes," Papers 2302.05260, arXiv.org, revised Oct 2023.
    10. Q. Clairon & R. Henderson & N. J. Young & E. D. Wilson & C. J. Taylor, 2021. "Adaptive treatment and robust control," Biometrics, The International Biometric Society, vol. 77(1), pages 223-236, March.
    11. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    12. Jelena Bradic & Weijie Ji & Yuqian Zhang, 2021. "High-dimensional Inference for Dynamic Treatment Effects," Papers 2110.04924, arXiv.org, revised May 2023.
    13. Anders Bredahl Kock & Martin Thyrsgaard, 2017. "Optimal sequential treatment allocation," Papers 1705.09952, arXiv.org, revised Aug 2018.
    14. Timothy B. Armstrong & Shu Shen, 2013. "Inference on Optimal Treatment Assignments," Cowles Foundation Discussion Papers 1927RR, Cowles Foundation for Research in Economics, Yale University, revised Apr 2015.
    15. Ruoqing Zhu & Ying-Qi Zhao & Guanhua Chen & Shuangge Ma & Hongyu Zhao, 2017. "Greedy outcome weighted tree learning of optimal personalized treatment rules," Biometrics, The International Biometric Society, vol. 73(2), pages 391-400, June.
    16. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    17. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    18. Henrika Langen & Martin Huber, 2022. "How causal machine learning can leverage marketing strategies: Assessing and improving the performance of a coupon campaign," Papers 2204.10820, arXiv.org, revised Jun 2022.
    19. Toru Kitagawa & Weining Wang & Mengshan Xu, 2022. "Policy Choice in Time Series by Empirical Welfare Maximization," Papers 2205.03970, arXiv.org, revised Jun 2023.
    20. Augustine Denteh & Helge Liebert, 2022. "Who Increases Emergency Department Use? New Insights from the Oregon Health Insurance Experiment," Working Papers 2201, Tulane University, Department of Economics.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2404.00221. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.