
Inference on Optimal Policy Values and Other Irregular Functionals via Smoothing

Author

Listed:
  • Justin Whitehouse
  • Morgane Austern
  • Vasilis Syrgkanis

Abstract

Constructing confidence intervals for the value of an optimal treatment policy is an important problem in causal inference. Insight into the optimal policy value can guide the development of reward-maximizing, individualized treatment regimes. However, because the functional that defines the optimal value is non-differentiable, standard semi-parametric approaches for performing inference fail to be directly applicable. Existing approaches for handling this non-differentiability fall roughly into two camps. In one camp are estimators based on constructing smooth approximations of the optimal value. These approaches are computationally lightweight, but typically place unrealistic parametric assumptions on outcome regressions. In another camp are approaches that directly de-bias the non-smooth objective. These approaches do not place parametric assumptions on nuisance functions, but they either require the computation of intractably many nuisance estimates, assume unrealistic $L^\infty$ nuisance convergence rates, or make strong margin assumptions that prohibit non-response to a treatment. In this paper, we revisit the problem of constructing smooth approximations of non-differentiable functionals. By carefully controlling first-order bias and second-order remainders, we show that a softmax smoothing-based estimator can be used to estimate parameters that are specified as a maximum of scores involving nuisance components. In particular, this includes the value of the optimal treatment policy as a special case. Our estimator obtains $\sqrt{n}$ convergence rates, avoids parametric restrictions/unrealistic margin assumptions, and is often statistically efficient.
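The softmax smoothing idea mentioned in the abstract can be illustrated with a standard log-sum-exp approximation of a maximum: the smoothed value exceeds the hard maximum by at most $\log(K)/\beta$ for $K$ scores and inverse temperature $\beta$, so it converges to the maximum as $\beta$ grows. This is only an illustrative sketch of the generic smoothing device, not the paper's estimator; the function name and the score/temperature values are invented for the example.

```python
import math

def softmax_smooth_max(scores, beta):
    """Log-sum-exp smoothing of max(scores) with inverse temperature beta."""
    # Subtract the hard max before exponentiating for numerical stability.
    m = max(scores)
    return m + math.log(sum(math.exp(beta * (s - m)) for s in scores)) / beta

scores = [0.2, 0.5, 0.45]  # e.g., estimated policy scores for three treatments
for beta in (1.0, 10.0, 100.0):
    smooth = softmax_smooth_max(scores, beta)
    # The smoothed value overshoots max(scores) by at most log(len(scores)) / beta,
    # so larger beta gives a tighter (but less smooth) approximation.
    print(beta, smooth)
```

Unlike the hard maximum, the log-sum-exp surrogate is differentiable in the scores, which is what makes standard semi-parametric bias corrections applicable to the smoothed objective.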

Suggested Citation

  • Justin Whitehouse & Morgane Austern & Vasilis Syrgkanis, 2025. "Inference on Optimal Policy Values and Other Irregular Functionals via Smoothing," Papers 2507.11780, arXiv.org.
  • Handle: RePEc:arx:papers:2507.11780
    Download full text from publisher

    File URL: http://arxiv.org/pdf/2507.11780
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Keisuke Hirano & Jack R. Porter, 2009. "Asymptotics for Statistical Treatment Rules," Econometrica, Econometric Society, vol. 77(5), pages 1683-1701, September.
    2. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    3. Erica E. M. Moodie & Thomas S. Richardson & David A. Stephens, 2007. "Demystifying Optimal Dynamic Treatment Regimes," Biometrics, The International Biometric Society, vol. 63(2), pages 447-455, June.
    4. Chengchun Shi & Shikai Luo & Yuan Le & Hongtu Zhu & Rui Song, 2024. "Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 119(545), pages 232-245, January.
    5. Toru Kitagawa & Aleksey Tetenov, 2018. "Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice," Econometrica, Econometric Society, vol. 86(2), pages 591-616, March.
    6. Bibhas Chakraborty & Eric B. Laber & Yingqi Zhao, 2013. "Inference for Optimal Dynamic Treatment Regimes Using an Adaptive m-Out-of-n Bootstrap Scheme," Biometrics, The International Biometric Society, vol. 69(3), pages 714-723, September.
    7. Dylan J. Foster & Vasilis Syrgkanis, 2019. "Orthogonal Statistical Learning," Papers 1901.09036, arXiv.org, revised Jun 2023.
    8. S. A. Murphy, 2003. "Optimal dynamic treatment regimes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 65(2), pages 331-355, May.
    9. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    10. Yizhe Xu & Tom H. Greene & Adam P. Bress & Brian C. Sauer & Brandon K. Bellows & Yue Zhang & William S. Weintraub & Andrew E. Moran & Jincheng Shen, 2022. "Estimating the optimal individualized treatment rule from a cost‐effectiveness perspective," Biometrics, The International Biometric Society, vol. 78(1), pages 337-351, March.
    11. Susan Athey & Stefan Wager, 2021. "Policy Learning With Observational Data," Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
    12. Charles F. Manski, 2004. "Statistical Treatment Rules for Heterogeneous Populations," Econometrica, Econometric Society, vol. 72(4), pages 1221-1246, July.
    13. Toru Kitagawa & Sokbae Lee & Chen Qiu, 2023. "Treatment choice, mean square regret and partial identification," The Japanese Economic Review, Springer, vol. 74(4), pages 573-602, October.
    14. Hongming Pu & Bo Zhang, 2021. "Estimating optimal treatment rules with an instrumental variable: A partial identification learning approach," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(2), pages 318-345, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ayush Sawarni & Jikai Jin & Justin Whitehouse & Vasilis Syrgkanis, 2025. "Policy Learning with Abstention," Papers 2510.19672, arXiv.org, revised Nov 2025.
    2. Henrika Langen & Martin Huber, 2023. "How causal machine learning can leverage marketing strategies: Assessing and improving the performance of a coupon campaign," PLOS ONE, Public Library of Science, vol. 18(1), pages 1-37, January.
    3. Augustine Denteh & Helge Liebert, 2022. "Who Increases Emergency Department Use? New Insights from the Oregon Health Insurance Experiment," Papers 2201.07072, arXiv.org, revised Apr 2023.
    4. Achim Ahrens & Alessandra Stampi‐Bombelli & Selina Kurer & Dominik Hangartner, 2024. "Optimal multi‐action treatment allocation: A two‐phase field experiment to boost immigrant naturalization," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(7), pages 1379-1395, November.
    5. Davide Viviano, 2019. "Policy Targeting under Network Interference," Papers 1906.10258, arXiv.org, revised Apr 2024.
    6. Julia Hatamyar & Noemi Kreif, 2023. "Policy Learning with Rare Outcomes," Papers 2302.05260, arXiv.org, revised Oct 2023.
    7. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    8. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    9. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    10. Shosei Sakaguchi, 2021. "Estimation of Optimal Dynamic Treatment Assignment Rules under Policy Constraints," Papers 2106.05031, arXiv.org, revised Aug 2024.
    11. Nora Bearth & Michael Lechner & Jana Mareckova & Fabian Muny, 2025. "Fairness-Aware and Interpretable Policy Learning," Papers 2509.12119, arXiv.org.
    12. Yu-Chang Chen & Haitian Xie, 2022. "Personalized Subsidy Rules," Papers 2202.13545, arXiv.org, revised Mar 2022.
    13. Goller, Daniel & Lechner, Michael & Pongratz, Tamara & Wolff, Joachim, 2025. "Active labor market policies for the long-term unemployed: New evidence from causal machine learning," Labour Economics, Elsevier, vol. 94(C).
    14. Takanori Ida & Takunori Ishihara & Koichiro Ito & Daido Kido & Toru Kitagawa & Shosei Sakaguchi & Shusaku Sasaki, 2022. "Choosing Who Chooses: Selection-Driven Targeting in Energy Rebate Programs," NBER Working Papers 30469, National Bureau of Economic Research, Inc.
    15. Takanori Ida & Takunori Ishihara & Koichiro Ito & Daido Kido & Toru Kitagawa & Shosei Sakaguchi & Shusaku Sasaki, 2021. "Paternalism, Autonomy, or Both? Experimental Evidence from Energy Saving Programs," Papers 2112.09850, arXiv.org.
    16. Patrick Rehill & Nicholas Biddle, 2025. "Policy Learning for Many Outcomes of Interest: Combining Optimal Policy Trees with Multi-objective Bayesian Optimisation," Computational Economics, Springer;Society for Computational Economics, vol. 66(2), pages 971-1001, August.
    17. Davide Viviano & Jess Rudder, 2020. "Policy design in experiments with unknown interference," Papers 2011.08174, arXiv.org, revised May 2024.
    18. Carlos Fernández-Loría & Foster Provost & Jesse Anderton & Benjamin Carterette & Praveen Chandar, 2023. "A Comparison of Methods for Treatment Assignment with an Application to Playlist Generation," Information Systems Research, INFORMS, vol. 34(2), pages 786-803, June.
    19. Emily Breza & Arun G. Chandrasekhar & Davide Viviano, 2025. "Generalizability with ignorance in mind: learning what we do (not) know for archetypes discovery," Papers 2501.13355, arXiv.org, revised Jul 2025.
    20. J. Cordier & I. Salvi & V. Steinbeck & A. Geissler & J. Vogel, 2023. "Is rapid recovery always the best recovery? - Developing a machine learning approach for optimal assignment rules under capacity constraints for knee replacement patients," Health, Econometrics and Data Group (HEDG) Working Papers 23/08, HEDG, c/o Department of Economics, University of York.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2507.11780. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.