Causal Q-Aggregation for CATE Model Selection

Author

Listed:
  • Hui Lan
  • Vasilis Syrgkanis

Abstract

Accurate estimation of conditional average treatment effects (CATE) is at the core of personalized decision making. While there is a plethora of models for CATE estimation, model selection is a nontrivial task, due to the fundamental problem of causal inference. Recent empirical work provides evidence in favor of proxy loss metrics with doubly robust properties and in favor of model ensembling. However, theoretical understanding is lacking. Direct application of prior theoretical work leads to suboptimal oracle model selection rates due to the non-convexity of the model selection problem. We provide regret rates for the major existing CATE ensembling approaches and propose a new CATE model ensembling approach based on Q-aggregation using the doubly robust loss. Our main result shows that causal Q-aggregation achieves statistically optimal oracle model selection regret rates of $\frac{\log(M)}{n}$ (with $M$ models and $n$ samples), with the addition of higher-order estimation error terms related to products of errors in the nuisance functions. Crucially, our regret rate does not require that any of the candidate CATE models be close to the truth. We validate our new method on many semi-synthetic datasets and also provide extensions of our work to CATE model selection with instrumental variables and unobserved confounding.
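
To make the idea in the abstract concrete, here is a minimal sketch of ensembling $M$ candidate CATE models with a doubly robust loss. It is an illustration under stated assumptions, not the authors' implementation. The pseudo-outcome is the standard AIPW form for a binary treatment. The objective is one common form of Q-aggregation from the aggregation literature, mixing the ensemble's loss with the weighted losses of the individual candidates using a mixing weight `nu`. The function names, the choice `nu=0.5`, the exponentiated-gradient solver, and the omission of any prior penalty term or cross-fitting are all assumptions made for this sketch. The nuisance estimates (`e_hat`, `g0_hat`, `g1_hat`) are assumed to be supplied as arrays.

```python
import numpy as np


def dr_pseudo_outcome(y, t, e_hat, g0_hat, g1_hat):
    """Standard AIPW (doubly robust) pseudo-outcome for binary treatment t in {0, 1}.

    e_hat is the estimated propensity; g0_hat and g1_hat are the estimated
    outcome regressions under control and treatment (assumed given).
    """
    g_t = np.where(t == 1, g1_hat, g0_hat)
    weight = (t - e_hat) / (e_hat * (1.0 - e_hat))
    return g1_hat - g0_hat + weight * (y - g_t)


def q_objective(theta, preds, y_dr, nu=0.5):
    """(1 - nu) * loss of the ensemble + nu * theta-weighted candidate losses.

    preds has shape (n, M): column j holds candidate model j's CATE predictions.
    """
    ensemble_loss = np.mean((y_dr - preds @ theta) ** 2)
    model_losses = np.mean((y_dr[:, None] - preds) ** 2, axis=0)
    return (1.0 - nu) * ensemble_loss + nu * (model_losses @ theta)


def q_aggregate(preds, y_dr, nu=0.5, steps=500, lr=0.5):
    """Minimize q_objective over the simplex with exponentiated gradient.

    The solver is an illustrative choice; any simplex-constrained convex
    optimizer could be substituted.
    """
    n, m = preds.shape
    theta = np.full(m, 1.0 / m)  # start from uniform weights
    model_losses = np.mean((y_dr[:, None] - preds) ** 2, axis=0)
    for _ in range(steps):
        resid = y_dr - preds @ theta
        grad = (1.0 - nu) * (-2.0 / n) * (preds.T @ resid) + nu * model_losses
        # Multiplicative update keeps theta nonnegative; shifting by the
        # minimum gradient only rescales theta before normalization.
        theta = theta * np.exp(-lr * (grad - grad.min()))
        theta = theta / theta.sum()
    return theta


# Tiny usage example: one candidate matches the pseudo-outcome, one is biased.
y_dr_demo = np.linspace(-1.0, 1.0, 50)
preds_demo = np.column_stack([y_dr_demo, y_dr_demo + 2.0])
theta_demo = q_aggregate(preds_demo, y_dr_demo)
```

In this toy example the weight vector should concentrate on the first candidate. With `nu = 0` the objective reduces to plain convex stacking on the doubly robust loss, and with `nu = 1` it reduces to choosing a single best model, so `nu` interpolates between the two.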

Suggested Citation

  • Hui Lan & Vasilis Syrgkanis, 2023. "Causal Q-Aggregation for CATE Model Selection," Papers 2310.16945, arXiv.org, revised Nov 2023.
  • Handle: RePEc:arx:papers:2310.16945

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2310.16945
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    2. Grimmer, Justin & Messing, Solomon & Westwood, Sean J., 2017. "Estimating Heterogeneous Treatment Effects and the Effects of Heterogeneous Treatments with Ensemble Methods," Political Analysis, Cambridge University Press, vol. 25(4), pages 413-434, October.
    3. Dylan J. Foster & Vasilis Syrgkanis, 2019. "Orthogonal Statistical Learning," Papers 1901.09036, arXiv.org, revised Jun 2023.
    4. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    5. Raaz Dwivedi & Yan Shuo Tan & Briton Park & Mian Wei & Kevin Horgan & David Madigan & Bin Yu, 2020. "Stable Discovery of Interpretable Subgroups via Calibration in Causal Studies," International Statistical Review, International Statistical Institute, vol. 88(S1), pages 135-178, December.
    6. Craig A. Rolling & Yuhong Yang, 2014. "Model selection for estimating treatment effects," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(4), pages 749-769, September.
    7. X Nie & S Wager, 2021. "Quasi-oracle estimation of heterogeneous treatment effects," Biometrika, Biometrika Trust, vol. 108(2), pages 299-319.
    8. Poterba, James M. & Venti, Steven F. & Wise, David A., 1995. "Do 401(k) contributions crowd out other personal saving?," Journal of Public Economics, Elsevier, vol. 58(1), pages 1-32, September.
    9. McAlinn, Kenichiro & West, Mike, 2019. "Dynamic Bayesian predictive synthesis in time series forecasting," Journal of Econometrics, Elsevier, vol. 210(1), pages 155-169.
    10. Vira Semenova & Victor Chernozhukov, 2021. "Debiased machine learning of conditional average treatment effects and other causal functions," The Econometrics Journal, Royal Economic Society, vol. 24(2), pages 264-289.

    Citations

    Cited by:

    1. Masahiro Kato, 2024. "Triple/Debiased Lasso for Statistical Inference of Conditional Average Treatment Effects," Papers 2403.03240, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Michael Lechner & Jana Mareckova, 2024. "Comprehensive Causal Machine Learning," Papers 2405.10198, arXiv.org.
    2. Daniel Goller, 2023. "Analysing a built-in advantage in asymmetric darts contests using causal machine learning," Annals of Operations Research, Springer, vol. 325(1), pages 649-679, June.
    3. David M. Ritzwoller & Vasilis Syrgkanis, 2024. "Simultaneous Inference for Local Structural Parameters with Random Forests," Papers 2405.07860, arXiv.org, revised Sep 2024.
    4. Phillip Heiler & Michael C. Knaus, 2021. "Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments," Papers 2110.01427, arXiv.org, revised Aug 2023.
    5. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    6. Retsef Levi & Elisabeth Paulson & Georgia Perakis & Emily Zhang, 2024. "Heterogeneous Treatment Effects in Panel Data," Papers 2406.05633, arXiv.org.
    7. Nora Bearth & Michael Lechner, 2024. "Causal Machine Learning for Moderation Effects," Papers 2401.08290, arXiv.org, revised Apr 2024.
    8. Kazuhiko Shinoda & Takahiro Hoshino, 2022. "Orthogonal Series Estimation for the Ratio of Conditional Expectation Functions," Papers 2212.13145, arXiv.org.
    9. Paul B. Ellickson & Wreetabrata Kar & James C. Reeder, 2023. "Estimating Marketing Component Effects: Double Machine Learning from Targeted Digital Promotions," Marketing Science, INFORMS, vol. 42(4), pages 704-728, July.
    10. Michael C Knaus & Michael Lechner & Anthony Strittmatter, 2021. "Machine learning estimation of heterogeneous causal effects: Empirical Monte Carlo evidence," The Econometrics Journal, Royal Economic Society, vol. 24(1), pages 134-161.
    11. Bokelmann, Björn & Lessmann, Stefan, 2024. "Improving uplift model evaluation on randomized controlled trial data," European Journal of Operational Research, Elsevier, vol. 313(2), pages 691-707.
    12. Yiyi Huo & Yingying Fan & Fang Han, 2023. "On the adaptation of causal forests to manifold data," Papers 2311.16486, arXiv.org, revised Dec 2023.
    13. Patrick Rehill & Nicholas Biddle, 2024. "Heterogeneous treatment effect estimation with high-dimensional data in public policy evaluation -- an application to the conditioning of cash transfers in Morocco using causal machine learning," Papers 2401.07075, arXiv.org, revised Mar 2024.
    14. Heejun Shin & Joseph Antonelli, 2023. "Improved inference for doubly robust estimators of heterogeneous treatment effects," Biometrics, The International Biometric Society, vol. 79(4), pages 3140-3152, December.
    15. Phillip Heiler, 2022. "Heterogeneous Treatment Effect Bounds under Sample Selection with an Application to the Effects of Social Media on Political Polarization," Papers 2209.04329, arXiv.org, revised Jul 2024.
    16. Huber, Martin & Meier, Jonas & Wallimann, Hannes, 2022. "Business analytics meets artificial intelligence: Assessing the demand effects of discounts on Swiss train tickets," Transportation Research Part B: Methodological, Elsevier, vol. 163(C), pages 22-39.
    17. Henrika Langen & Martin Huber, 2022. "How causal machine learning can leverage marketing strategies: Assessing and improving the performance of a coupon campaign," Papers 2204.10820, arXiv.org, revised Jun 2022.
    18. Nan Liu & Yanbo Liu & Yuya Sasaki, 2024. "Estimation and Inference for Causal Functions with Multiway Clustered Data," Papers 2409.06654, arXiv.org.
    19. Martin Huber & Jannis Kueck, 2022. "Testing the identification of causal effects in observational data," Papers 2203.15890, arXiv.org, revised Jun 2023.
    20. Hua Chen & Jianing Xing & Xiaoxu Yang & Kai Zhan, 2021. "Heterogeneous Effects of Health Insurance on Rural Children’s Health in China: A Causal Machine Learning Approach," IJERPH, MDPI, vol. 18(18), pages 1-14, September.
