Causal Q-Aggregation for CATE Model Selection

Author

Hui Lan
Vasilis Syrgkanis

Abstract

Accurate estimation of conditional average treatment effects (CATE) is at the core of personalized decision making. While there is a plethora of models for CATE estimation, model selection is a nontrivial task, due to the fundamental problem of causal inference. Recent empirical work provides evidence in favor of proxy loss metrics with doubly robust properties and in favor of model ensembling. However, theoretical understanding is lacking. Direct application of prior theoretical work leads to suboptimal oracle model selection rates due to the non-convexity of the model selection problem. We provide regret rates for the major existing CATE ensembling approaches and propose a new CATE model ensembling approach based on Q-aggregation using the doubly robust loss. Our main result shows that causal Q-aggregation achieves statistically optimal oracle model selection regret rates of $\frac{\log(M)}{n}$ (with $M$ models and $n$ samples), with the addition of higher-order estimation error terms related to products of errors in the nuisance functions. Crucially, our regret rate does not require that any of the candidate CATE models be close to the truth. We validate our new method on many semi-synthetic datasets and also provide extensions of our work to CATE model selection with instrumental variables and unobserved confounding.
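
The idea of Q-aggregation with a doubly robust loss can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a binary treatment, nuisance estimates (outcome regressions and propensity) that were already fitted on a separate sample, the textbook Q-aggregation objective with a uniform prior over the M candidates (so the prior penalty is constant on the simplex and is omitted), and a plain Frank-Wolfe solver. The function names `dr_pseudo_outcome` and `q_aggregate`, the mixing parameter `nu`, and the propensity clipping threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def dr_pseudo_outcome(y, d, mu0, mu1, e, clip=1e-3):
    """Doubly robust (AIPW) pseudo-outcome for a binary treatment d.

    mu0, mu1: estimated outcome regressions E[Y | X, D=0] and E[Y | X, D=1].
    e: estimated propensity P(D=1 | X).
    """
    # Assumption: trim extreme propensities to keep the weights bounded.
    e = np.clip(e, clip, 1.0 - clip)
    return mu1 - mu0 + d * (y - mu1) / e - (1 - d) * (y - mu0) / (1 - e)


def q_aggregate(preds, y_dr, nu=0.5, n_iter=200):
    """Simplex weights over M candidate CATE models via Q-aggregation.

    preds: array of shape (n, M), column j holds model j's CATE predictions.
    y_dr:  array of shape (n,), doubly robust pseudo-outcomes.

    Minimizes over the simplex (uniform prior assumed, penalty omitted):
        (1 - nu) * mean((y_dr - preds @ theta)**2)
        + nu * sum_j theta_j * mean((y_dr - preds[:, j])**2)
    """
    n, m = preds.shape
    # Per-model proxy loss against the pseudo-outcome.
    losses = np.mean((y_dr[:, None] - preds) ** 2, axis=0)
    theta = np.full(m, 1.0 / m)
    for k in range(n_iter):
        resid = preds @ theta - y_dr
        grad = (1 - nu) * 2.0 * preds.T @ resid / n + nu * losses
        j = int(np.argmin(grad))   # best vertex of the simplex
        gamma = 2.0 / (k + 2.0)    # standard Frank-Wolfe step size
        theta *= 1.0 - gamma
        theta[j] += gamma
    return theta
```

Given the weights, the ensemble CATE prediction is `preds @ theta`. Setting `nu` to 0 reduces the objective to convex stacking on the pseudo-outcome, and setting it to 1 reduces it to picking the single model with the smallest proxy loss; the intermediate value used here is an assumption, not a recommendation taken from the paper.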

Suggested Citation

Hui Lan & Vasilis Syrgkanis, 2023. "Causal Q-Aggregation for CATE Model Selection," Papers 2310.16945, arXiv.org, revised Apr 2025.

Handle: RePEc:arx:papers:2310.16945

Citations

Cited by:

Masahiro Kato, 2024. "Triple/Debiased Lasso for Statistical Inference of Conditional Average Treatment Effects," Papers 2403.03240, arXiv.org.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Michael Lechner & Jana Mareckova, 2024. "Comprehensive Causal Machine Learning," Papers 2405.10198, arXiv.org, revised Feb 2025.
Daniel Goller, 2023. "Analysing a built-in advantage in asymmetric darts contests using causal machine learning," Annals of Operations Research, Springer, vol. 325(1), pages 649-679, June.
- Daniel Goller, 2020. "Analysing a built-in advantage in asymmetric darts contests using causal machine learning," Papers 2008.07165, arXiv.org.
- Goller, Daniel, 2020. "Analysing a built-in advantage in asymmetric darts contests using causal machine learning," Economics Working Paper Series 2013, University of St. Gallen, School of Economics and Political Science.
David M. Ritzwoller & Vasilis Syrgkanis, 2024. "Simultaneous Inference for Local Structural Parameters with Random Forests," Papers 2405.07860, arXiv.org, revised Sep 2024.
Phillip Heiler & Michael C. Knaus, 2021. "Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments," Papers 2110.01427, arXiv.org, revised Aug 2023.
- Heiler, Phillip & Knaus, Michael C., 2022. "Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments," IZA Discussion Papers 15580, Institute of Labor Economics (IZA).
Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
Retsef Levi & Elisabeth Paulson & Georgia Perakis & Emily Zhang, 2024. "Heterogeneous Treatment Effects in Panel Data," Papers 2406.05633, arXiv.org.
Nora Bearth & Michael Lechner, 2024. "Causal Machine Learning for Moderation Effects," Papers 2401.08290, arXiv.org, revised Jan 2025.
Kazuhiko Shinoda & Takahiro Hoshino, 2022. "Orthogonal Series Estimation for the Ratio of Conditional Expectation Functions," Papers 2212.13145, arXiv.org.
Paul B. Ellickson & Wreetabrata Kar & James C. Reeder, 2023. "Estimating Marketing Component Effects: Double Machine Learning from Targeted Digital Promotions," Marketing Science, INFORMS, vol. 42(4), pages 704-728, July.
Justin Whitehouse & Morgane Austern & Vasilis Syrgkanis, 2025. "Inference on Optimal Policy Values and Other Irregular Functionals via Smoothing," Papers 2507.11780, arXiv.org.
Bokelmann, Björn & Lessmann, Stefan, 2024. "Improving uplift model evaluation on randomized controlled trial data," European Journal of Operational Research, Elsevier, vol. 313(2), pages 691-707.
Wu, Guojun & Song, Ge & Lv, Xiaoxiang & Luo, Shikai & Shi, Chengchun & Zhu, Hongtu, 2023. "DNet: distributional network for distributional individualized treatment effects," LSE Research Online Documents on Economics 122895, London School of Economics and Political Science, LSE Library.
Ta-Wei Huang & Eva Ascarza, 2024. "Doing More with Less: Overcoming Ineffective Long-Term Targeting Using Short-Term Signals," Marketing Science, INFORMS, vol. 43(4), pages 863-884, July.
Yiyi Huo & Yingying Fan & Fang Han, 2023. "On the adaptation of causal forests to manifold data," Papers 2311.16486, arXiv.org, revised Dec 2023.
Bolbocean, Corneliu & Anderson, Peter J. & Bartmann, Peter & Cheong, Jeanie L.Y. & Doyle, Lex W. & Johnson, Samantha & Marlow, Neil & Wolke, Dieter & Petrou, Stavros & O'Neill, Stephen, 2025. "A heterogeneity analysis of health-related quality of life in early adults born very preterm or very low birthweight across the sociodemographic spectrum," Social Science & Medicine, Elsevier, vol. 380(C).
Patrick Rehill & Nicholas Biddle, 2024. "Heterogeneous treatment effect estimation with high-dimensional data in public policy evaluation -- an application to the conditioning of cash transfers in Morocco using causal machine learning," Papers 2401.07075, arXiv.org, revised Mar 2024.
Heejun Shin & Joseph Antonelli, 2023. "Improved inference for doubly robust estimators of heterogeneous treatment effects," Biometrics, The International Biometric Society, vol. 79(4), pages 3140-3152, December.
Phillip Heiler & Michael C. Knaus, 2025. "Heterogeneity Analysis with Heterogeneous Treatments," Papers 2507.01517, arXiv.org.
Michael C Knaus & Michael Lechner & Anthony Strittmatter, 2021. "Machine learning estimation of heterogeneous causal effects: Empirical Monte Carlo evidence," The Econometrics Journal, Royal Economic Society, vol. 24(1), pages 134-161.
- Knaus, Michael C. & Lechner, Michael & Strittmatter, Anthony, 2018. "Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence," IZA Discussion Papers 12039, Institute of Labor Economics (IZA).
- Lechner, Michael & Knaus, Michael C. & Strittmatter, Anthony, 2018. "Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence," CEPR Discussion Papers 13402, C.E.P.R. Discussion Papers.
- Knaus, Michael C. & Lechner, Michael & Strittmatter, Anthony, 2018. "Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence," Economics Working Paper Series 1817, University of St. Gallen, School of Economics and Political Science.
- Michael C. Knaus & Michael Lechner & Anthony Strittmatter, 2018. "Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence," Papers 1810.13237, arXiv.org, revised Dec 2018.
Heiler, Phillip, 2024. "Heterogeneous treatment effect bounds under sample selection with an application to the effects of social media on political polarization," Journal of Econometrics, Elsevier, vol. 244(1).
- Phillip Heiler, 2022. "Heterogeneous Treatment Effect Bounds under Sample Selection with an Application to the Effects of Social Media on Political Polarization," Papers 2209.04329, arXiv.org, revised Jul 2024.
