IDEAS home Printed from https://ideas.repec.org/p/arx/papers/1810.04778.html

Offline Multi-Action Policy Learning: Generalization and Optimization

Author

Listed:
  • Zhengyuan Zhou
  • Susan Athey
  • Stefan Wager

Abstract

In many settings, a decision-maker wishes to learn a rule, or policy, that maps from observable characteristics of an individual to an action. Examples include selecting offers, prices, advertisements, or emails to send to consumers, as well as the problem of determining which medication to prescribe to a patient. While there is a growing body of literature devoted to this problem, most existing results are focused on the case where data comes from a randomized experiment, and further, there are only two possible actions, such as giving a drug to a patient or not. In this paper, we study the offline multi-action policy learning problem with observational data and where the policy may need to respect budget constraints or belong to a restricted policy class such as decision trees. We build on the theory of efficient semi-parametric inference in order to propose and implement a policy learning algorithm that achieves asymptotically minimax-optimal regret. To the best of our knowledge, this is the first result of this type in the multi-action setup, and it provides a substantial performance improvement over the existing learning algorithms. We then consider additional computational challenges that arise in implementing our method for the case where the policy is restricted to take the form of a decision tree. We propose two different approaches, one using a mixed integer program formulation and the other using a tree-search based algorithm.

Suggested Citation

  • Zhengyuan Zhou & Susan Athey & Stefan Wager, 2018. "Offline Multi-Action Policy Learning: Generalization and Optimization," Papers 1810.04778, arXiv.org, revised Nov 2018.
  • Handle: RePEc:arx:papers:1810.04778
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1810.04778
    File Function: Latest version
    Download Restriction: no
    ---><---

    Other versions of this item:

    References listed on IDEAS

    as
    1. Yingqi Zhao & Donglin Zeng & A. John Rush & Michael R. Kosorok, 2012. "Estimating Individualized Treatment Rules Using Outcome Weighted Learning," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(499), pages 1106-1118, September.
    2. Stoye, Jörg, 2009. "Minimax regret treatment choice with finite samples," Journal of Econometrics, Elsevier, vol. 151(1), pages 70-81, July.
    3. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers 49/16, Institute for Fiscal Studies.
    4. Newey, Whitney K, 1994. "The Asymptotic Variance of Semiparametric Estimators," Econometrica, Econometric Society, vol. 62(6), pages 1349-1382, November.
    5. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controlsâ€," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    6. Toru Kitagawa & Aleksey Tetenov, 2018. "Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice," Econometrica, Econometric Society, vol. 86(2), pages 591-616, March.
    7. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022. "Locally Robust Semiparametric Estimation," Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
    8. Xin Zhou & Nicole Mayer-Hamblett & Umer Khan & Michael R. Kosorok, 2017. "Residual Weighted Learning for Estimating Individualized Treatment Rules," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(517), pages 169-187, January.
    9. Guido W. Imbens, 2004. "Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review," The Review of Economics and Statistics, MIT Press, vol. 86(1), pages 4-29, February.
    10. Jinyong Hahn, 1998. "On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects," Econometrica, Econometric Society, vol. 66(2), pages 315-332, March.
    11. Athey, Susan & Wager, Stefan, 2017. "Efficient Policy Learning," Research Papers 3506, Stanford University, Graduate School of Business.
    12. James Heckman, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.
    13. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, Enero-Abr.
    14. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    15. Carri W. Chan & Vivek F. Farias & Nicholas Bambos & Gabriel J. Escobar, 2012. "Optimizing Intensive Care Unit Discharge Decisions with Patient Readmissions," Operations Research, INFORMS, vol. 60(6), pages 1323-1341, December.
    16. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    17. Daniel Russo & Benjamin Van Roy, 2014. "Learning to Optimize via Posterior Sampling," Mathematics of Operations Research, INFORMS, vol. 39(4), pages 1221-1243, November.
    18. Keisuke Hirano & Jack R. Porter, 2009. "Asymptotics for Statistical Treatment Rules," Econometrica, Econometric Society, vol. 77(5), pages 1683-1701, September.
    19. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    20. Charles F. Manski, 2004. "Statistical Treatment Rules for Heterogeneous Populations," Econometrica, Econometric Society, vol. 72(4), pages 1221-1246, July.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Susan Athey & Stefan Wager, 2021. "Policy Learning With Observational Data," Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
    2. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    3. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    4. Davide Viviano & Jelena Bradic, 2020. "Fair Policy Targeting," Papers 2005.12395, arXiv.org, revised Jun 2022.
    5. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    6. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    7. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    8. Davide Viviano, 2019. "Policy Targeting under Network Interference," Papers 1906.10258, arXiv.org, revised Apr 2024.
    9. Achim Ahrens & Alessandra Stampi‐Bombelli & Selina Kurer & Dominik Hangartner, 2024. "Optimal multi‐action treatment allocation: A two‐phase field experiment to boost immigrant naturalization," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(7), pages 1379-1395, November.
    10. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    11. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    12. Michael Pollmann, 2020. "Causal Inference for Spatial Treatments," Papers 2011.00373, arXiv.org, revised Apr 2026.
    13. Mert Demirer & Vasilis Syrgkanis & Greg Lewis & Victor Chernozhukov, 2019. "Semi-Parametric Efficient Policy Learning with Continuous Actions," Papers 1905.10116, arXiv.org, revised Jul 2019.
    14. Nan Liu & Yanbo Liu & Yuya Sasaki & Yuanyuan Wan, 2025. "Nonparametric Uniform Inference in Binary Classification and Policy Values," Working Papers tecipa-811, University of Toronto, Department of Economics.
    15. Rahul Singh & Liyuan Xu & Arthur Gretton, 2020. "Kernel Methods for Causal Functions: Dose, Heterogeneous, and Incremental Response Curves," Papers 2010.04855, arXiv.org, revised Oct 2022.
    16. Simon Calmar Andersen & Louise Beuchert & Phillip Heiler & Helena Skyt Nielsen, 2023. "A Guide to Impact Evaluation under Sample Selection and Missing Data: Teacher's Aides and Adolescent Mental Health," Papers 2308.04963, arXiv.org.
    17. Alejandro Sanchez-Becerra, 2023. "Robust inference for the treatment effect variance in experiments using machine learning," Papers 2306.03363, arXiv.org.
    18. Karun Adusumilli & Friedrich Geiecke & Claudio Schilter, 2019. "Dynamically Optimal Treatment Allocation," Papers 1904.01047, arXiv.org, revised Nov 2024.
    19. Vira Semenova, 2020. "Generalized Lee Bounds," Papers 2008.12720, arXiv.org, revised May 2025.
    20. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:1810.04778. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.