
Inference on Optimal Dynamic Policies via Softmax Approximation

Authors

  • Qizhao Chen
  • Morgane Austern
  • Vasilis Syrgkanis

Abstract

Estimating optimal dynamic policies from offline data is a fundamental problem in dynamic decision making. In the context of causal inference, the problem is known as estimating the optimal dynamic treatment regime. Although a plethora of estimation methods exists, constructing confidence intervals for the value of the optimal regime and for the structural parameters associated with it is inherently harder, because it involves non-linear and non-differentiable functionals of unknown quantities that must themselves be estimated. Prior work resorted to sub-sampling approaches that can degrade the quality of the estimate. We show that a simple softmax approximation to the optimal treatment regime, with an appropriately fast-growing temperature parameter, can achieve valid inference on the truly optimal regime. We illustrate our result for a two-period optimal dynamic regime, though our approach should extend directly to the finite-horizon case. Our work combines techniques from semi-parametric inference and $g$-estimation with an appropriate triangular-array central limit theorem and a novel analysis of the asymptotic influence and asymptotic bias of softmax approximations.
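
The softmax idea in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it only shows, for a single decision point, how a softmax-weighted average of action values approaches the non-differentiable maximum as the temperature parameter grows. The names `softmax_weights`, `softmax_value`, and the values in `q_hat` are hypothetical, and the convention that the temperature multiplies the values (so larger means sharper) is an assumption chosen to match the abstract's "growing temperature" wording.

```python
import math


def softmax_weights(q_values, temperature):
    """Return softmax probabilities over actions.

    Assumption: the temperature multiplies the values, so a larger
    temperature puts more weight on the best action.
    """
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(temperature * (q - m)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_value(q_values, temperature):
    """Smooth surrogate for max(q_values): the softmax-weighted average."""
    weights = softmax_weights(q_values, temperature)
    return sum(w * q for w, q in zip(weights, q_values))


# Hypothetical estimated values of three treatments for one unit.
q_hat = [0.2, 0.5, 0.45]

for beta in (1.0, 10.0, 100.0, 1000.0):
    gap = max(q_hat) - softmax_value(q_hat, beta)
    print(f"temperature={beta:g}  softmax value={softmax_value(q_hat, beta):.6f}  gap to max={gap:.3e}")
```

In this toy setting the gap to the hard maximum is at most (K - 1) / (e * temperature) for K actions, so it vanishes as the temperature grows, while the surrogate stays differentiable in the estimated values. How fast the temperature should grow with the sample size for valid inference is the subject of the paper and is not addressed by this sketch.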

Suggested Citation

  • Qizhao Chen & Morgane Austern & Vasilis Syrgkanis, 2023. "Inference on Optimal Dynamic Policies via Softmax Approximation," Papers 2303.04416, arXiv.org, revised Dec 2023.
  • Handle: RePEc:arx:papers:2303.04416

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2303.04416
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Erica E. M. Moodie & Thomas S. Richardson & David A. Stephens, 2007. "Demystifying Optimal Dynamic Treatment Regimes," Biometrics, The International Biometric Society, vol. 63(2), pages 447-455, June.
    2. S. A. Murphy, 2003. "Optimal dynamic treatment regimes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 65(2), pages 331-355, May.
    3. Emmanuel Rio, 2009. "Moment Inequalities for Sums of Dependent Random Variables under Projective Conditions," Journal of Theoretical Probability, Springer, vol. 22(1), pages 146-163, March.
    4. Victor Chernozhukov & Whitney Newey & Rahul Singh & Vasilis Syrgkanis, 2020. "Adversarial Estimation of Riesz Representers," Papers 2101.00009, arXiv.org, revised Apr 2024.

    Citations


    Cited by:

    1. Gyungbae Park, 2024. "Debiased Machine Learning when Nuisance Parameters Appear in Indicator Functions," Papers 2403.15934, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Q. Clairon & R. Henderson & N. J. Young & E. D. Wilson & C. J. Taylor, 2021. "Adaptive treatment and robust control," Biometrics, The International Biometric Society, vol. 77(1), pages 223-236, March.
    2. Luo, Yu & Graham, Daniel J. & McCoy, Emma J., 2023. "Semiparametric Bayesian doubly robust causal estimation," LSE Research Online Documents on Economics 117944, London School of Economics and Political Science, LSE Library.
    3. Benjamin Rich & Erica E. M. Moodie & David A. Stephens, 2016. "Influence Re-weighted G-Estimation," The International Journal of Biostatistics, De Gruyter, vol. 12(1), pages 157-177, May.
    4. Lingyun Lyu & Yu Cheng & Abdus S. Wahed, 2023. "Imputation‐based Q‐learning for optimizing dynamic treatment regimes with right‐censored survival outcome," Biometrics, The International Biometric Society, vol. 79(4), pages 3676-3689, December.
    5. Peng Wu & Donglin Zeng & Haoda Fu & Yuanjia Wang, 2020. "On using electronic health records to improve optimal treatment rules in randomized trials," Biometrics, The International Biometric Society, vol. 76(4), pages 1075-1086, December.
    6. Xiaofei Bai & Anastasios A. Tsiatis & Wenbin Lu & Rui Song, 2017. "Optimal treatment regimes for survival endpoints using a locally-efficient doubly-robust estimator from a classification perspective," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(4), pages 585-604, October.
    7. Wei Liu & Zhiwei Zhang & Lei Nie & Guoxing Soon, 2017. "A Case Study in Personalized Medicine: Rilpivirine Versus Efavirenz for Treatment-Naive HIV Patients," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1381-1392, October.
    8. Aniek Sies & Iven Van Mechelen, 2017. "Comparing Four Methods for Estimating Tree-Based Treatment Regimes," The International Journal of Biostatistics, De Gruyter, vol. 13(1), pages 1-20, May.
    9. Hongming Pu & Bo Zhang, 2021. "Estimating optimal treatment rules with an instrumental variable: A partial identification learning approach," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(2), pages 318-345, April.
    10. Qizhao Chen & Vasilis Syrgkanis & Morgane Austern, 2022. "Debiased Machine Learning without Sample-Splitting for Stable Estimators," Papers 2206.01825, arXiv.org, revised Nov 2022.
    11. Kao-Tai Tsai & Karl Peace, 2013. "Analysis of Subgroup Data of Clinical Trials," Journal of Causal Inference, De Gruyter, vol. 1(2), pages 193-207, September.
    12. Jin Wang & Donglin Zeng & D. Y. Lin, 2022. "Semiparametric single-index models for optimal treatment regimens with censored outcomes," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 28(4), pages 744-763, October.
    13. Shonosuke Sugasawa & Hisashi Noma, 2021. "Efficient screening of predictive biomarkers for individual treatment selection," Biometrics, The International Biometric Society, vol. 77(1), pages 249-257, March.
    14. Jingxiang Chen & Yufeng Liu & Donglin Zeng & Rui Song & Yingqi Zhao & Michael R. Kosorok, 2016. "Comment," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(515), pages 942-947, July.
    15. Jelena Bradic & Weijie Ji & Yuqian Zhang, 2021. "High-dimensional Inference for Dynamic Treatment Effects," Papers 2110.04924, arXiv.org, revised May 2023.
    16. Han, Sukjin, 2021. "Identification in nonparametric models for dynamic treatment effects," Journal of Econometrics, Elsevier, vol. 225(2), pages 132-147.
    17. Durlauf, Steven N. & Navarro, Salvador & Rivers, David A., 2016. "Model uncertainty and the effect of shall-issue right-to-carry laws on crime," European Economic Review, Elsevier, vol. 81(C), pages 32-67.
    18. Michael C Knaus & Michael Lechner & Anthony Strittmatter, 2021. "Machine learning estimation of heterogeneous causal effects: Empirical Monte Carlo evidence," The Econometrics Journal, Royal Economic Society, vol. 24(1), pages 134-161.
    19. Yufan Zhao & Donglin Zeng & Mark A. Socinski & Michael R. Kosorok, 2011. "Reinforcement Learning Strategies for Clinical Trials in Nonsmall Cell Lung Cancer," Biometrics, The International Biometric Society, vol. 67(4), pages 1422-1433, December.
    20. Anders Bredahl Kock & Martin Thyrsgaard, 2017. "Optimal sequential treatment allocation," Papers 1705.09952, arXiv.org, revised Aug 2018.
