Printed from https://ideas.repec.org/p/arx/papers/1905.10116.html

Semi-Parametric Efficient Policy Learning with Continuous Actions

Authors

  • Mert Demirer
  • Vasilis Syrgkanis
  • Greg Lewis
  • Victor Chernozhukov

Abstract

We consider off-policy evaluation and optimization with continuous action spaces. We focus on observational data where the data collection policy is unknown and needs to be estimated. We take a semi-parametric approach in which the value function takes a known parametric form in the treatment, but we are agnostic about how it depends on the observed contexts. We propose a doubly robust off-policy estimate for this setting and show that off-policy optimization based on this estimate is robust to estimation errors in the policy function or the regression model. Our results also apply when the model does not satisfy the semi-parametric form; in that case we measure regret relative to the best projection of the true value function onto this function space. Our work extends prior approaches to policy optimization from observational data, which considered only discrete actions. We provide an experimental evaluation of our method on synthetic data examples motivated by optimal personalized pricing and costly resource allocation.
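To make the abstract's construction concrete, here is a minimal illustrative sketch, not the paper's exact estimator: a doubly robust off-policy value estimate for a continuous treatment, combining a direct (regression) term with an importance-weighted residual correction. The synthetic setup, the quadratic-in-treatment value model, the Gaussian logging and target policies, and all variable names are our own assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Synthetic observational data: context x, logging policy t ~ N(x, 1),
# outcome with true value function V(t, x) = 2*x*t - 0.5*t^2.
x = rng.normal(size=n)
t = x + rng.normal(size=n)                        # logging (behavior) policy
y = 2 * x * t - 0.5 * t**2 + rng.normal(size=n)   # noisy observed outcomes

# Regression model q(x, t): parametric in the treatment, as in the
# semi-parametric setup; coefficients fit by least squares.
Z = np.column_stack([x * t, t**2, np.ones(n)])
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)

def q(xv, tv):
    return beta[0] * xv * tv + beta[1] * tv**2 + beta[2]

# A stochastic Gaussian target policy pi: t ~ N(1.5*x, 0.5^2).
mu, sigma = (lambda xv: 1.5 * xv), 0.5

def gauss_pdf(v, m, s):
    return np.exp(-0.5 * ((v - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# Doubly robust value estimate: direct term (Monte Carlo draw from pi,
# plugged into the regression) plus a density-ratio-weighted residual,
# which corrects for bias in q using the observed outcomes.
t_pi = mu(x) + sigma * rng.normal(size=n)
w = gauss_pdf(t, mu(x), sigma) / gauss_pdf(t, x, 1.0)   # pi / logging density
dr_value = np.mean(q(x, t_pi) + w * (y - q(x, t)))

# Under this design the true value of pi is 1.75.
print(f"DR estimate of the target policy's value: {dr_value:.3f}")
```

If either the regression `q` or the estimated logging density is misspecified (but not both), the correction term keeps the estimate consistent; that is the double robustness the abstract refers to.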

Suggested Citation

  • Mert Demirer & Vasilis Syrgkanis & Greg Lewis & Victor Chernozhukov, 2019. "Semi-Parametric Efficient Policy Learning with Continuous Actions," Papers 1905.10116, arXiv.org, revised Jul 2019.
  • Handle: RePEc:arx:papers:1905.10116

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1905.10116
    File Function: Latest version
    Download Restriction: no

    Citations

Citations are extracted by the CitEc Project.


    Cited by:

    1. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    2. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    3. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Yi Zhang & Kosuke Imai, 2023. "Individualized Policy Evaluation and Learning under Clustered Network Interference," Papers 2311.02467, arXiv.org, revised Mar 2025.
    5. Masahiro Kato & Masatoshi Uehara & Shota Yasui, 2020. "Off-Policy Evaluation and Learning for External Validity under a Covariate Shift," Papers 2002.11642, arXiv.org, revised Oct 2020.
    6. Masahiro Kato, 2020. "Confidence Interval for Off-Policy Evaluation from Dependent Samples via Bandit Algorithm: Approach from Standardized Martingales," Papers 2006.06982, arXiv.org.
    7. Rahul Singh & Liyuan Xu & Arthur Gretton, 2020. "Kernel Methods for Causal Functions: Dose, Heterogeneous, and Incremental Response Curves," Papers 2010.04855, arXiv.org, revised Oct 2022.
    8. Ruohan Zhan & Shichao Han & Yuchen Hu & Zhenling Jiang, 2024. "Estimating Treatment Effects under Recommender Interference: A Structured Neural Networks Approach," Papers 2406.14380, arXiv.org, revised Oct 2025.
    9. Chunrong Ai & Yue Fang & Haitian Xie, 2024. "Data-Driven Policy Learning for Continuous Treatments," Papers 2402.02535, arXiv.org, revised Dec 2025.
    10. AmirEmad Ghassami & Andrew Ying & Ilya Shpitser & Eric Tchetgen Tchetgen, 2021. "Minimax Kernel Machine Learning for a Class of Doubly Robust Functionals with Application to Proximal Causal Inference," Papers 2104.02929, arXiv.org, revised Mar 2022.
    11. Hirano, Keisuke & Porter, Jack R., 2020. "Asymptotic analysis of statistical decision rules in econometrics," Handbook of Econometrics, in: Steven N. Durlauf & Lars Peter Hansen & James J. Heckman & Rosa L. Matzkin (ed.), Handbook of Econometrics, edition 1, volume 7, chapter 0, pages 283-354, Elsevier.
    12. Cai, Hengrui & Shi, Chengchun & Song, Rui & Lu, Wenbin, 2023. "Jump interval-learning for individualized decision making with continuous treatments," LSE Research Online Documents on Economics 118231, London School of Economics and Political Science, LSE Library.
    13. Max H. Farrell & Tengyuan Liang & Sanjog Misra, 2020. "Deep Learning for Individual Heterogeneity," Papers 2010.14694, arXiv.org, revised Apr 2025.
    14. Andrew Bennett & Nathan Kallus, 2020. "Efficient Policy Learning from Surrogate-Loss Classification Reductions," Papers 2002.05153, arXiv.org.
    15. Ruohan Zhan & Zhimei Ren & Susan Athey & Zhengyuan Zhou, 2024. "Policy Learning with Adaptively Collected Data," Management Science, INFORMS, vol. 70(8), pages 5270-5297, August.



    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.