
Orthogonal Random Forest for Causal Inference

Author

Listed:
  • Miruna Oprescu
  • Vasilis Syrgkanis
  • Zhiwei Steven Wu

Abstract

We propose the orthogonal random forest, an algorithm that combines Neyman-orthogonality, which reduces sensitivity to estimation error in nuisance parameters, with generalized random forests (Athey et al., 2017), a flexible non-parametric method for statistical estimation of conditional moment models using random forests. We provide a consistency rate and establish asymptotic normality for our estimator. We show that under mild assumptions on the consistency rate of the nuisance estimator, we can achieve the same error rate as an oracle with a priori knowledge of these nuisance parameters. We show that when the nuisance functions have a locally sparse parametrization, a local $\ell_1$-penalized regression achieves the required rate. We apply our method to estimate heterogeneous treatment effects from observational data with discrete or continuous treatments, and we show that, unlike prior work, our method provably allows controlling for a high-dimensional set of variables under standard sparsity conditions. We also provide a comprehensive empirical evaluation of our algorithm on both synthetic and real data.
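
The orthogonalization idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it shows only a global, cross-fitted residual-on-residual estimate in a partially linear model, Y = theta*T + g(W) + noise, and it leaves out the forest-based local weighting and the local $\ell_1$-penalized nuisance fits entirely. The data-generating process, the one-dimensional control W, the simple linear nuisance models, and all function names (`ols_fit`, `residual_on_residual`, `simulate`) are assumptions made for illustration.

```python
import random


def ols_fit(x, y):
    """Fit y ~ a + b*x by least squares and return (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def residual_on_residual(w, t, y, n_folds=2):
    """Cross-fitted partialling-out estimate of theta in Y = theta*T + g(W) + noise."""
    n = len(w)
    t_res = [0.0] * n
    y_res = [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Nuisance models for E[T|W] and E[Y|W], fit on the other fold(s) only.
        a_t, b_t = ols_fit([w[i] for i in train], [t[i] for i in train])
        a_y, b_y = ols_fit([w[i] for i in train], [y[i] for i in train])
        for i in test:
            t_res[i] = t[i] - (a_t + b_t * w[i])
            y_res[i] = y[i] - (a_y + b_y * w[i])
    # Regress outcome residuals on treatment residuals (no intercept).
    num = sum(tr * yr for tr, yr in zip(t_res, y_res))
    den = sum(tr * tr for tr in t_res)
    return num / den


def simulate(n, theta, seed):
    """Hypothetical data: W confounds both the treatment T and the outcome Y."""
    rng = random.Random(seed)
    w = [rng.gauss(0, 1) for _ in range(n)]
    t = [0.8 * wi + rng.gauss(0, 1) for wi in w]
    y = [theta * ti + 1.5 * wi + rng.gauss(0, 1) for ti, wi in zip(t, w)]
    return w, t, y


w, t, y = simulate(n=4000, theta=2.0, seed=0)
theta_hat = residual_on_residual(w, t, y)
_, naive = ols_fit(t, y)  # regression of Y on T that ignores the confounder W
print(f"orthogonalized: {theta_hat:.3f}, naive: {naive:.3f}")
```

In this simulated setting the naive slope absorbs the confounding through W, while the residualized estimate should land close to the true theta of 2.0. Because the final regression uses residuals from both nuisance models, small errors in either fit affect the estimate only to second order, which is the sense in which the moment is Neyman-orthogonal.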

Suggested Citation

  • Miruna Oprescu & Vasilis Syrgkanis & Zhiwei Steven Wu, 2018. "Orthogonal Random Forest for Causal Inference," Papers 1806.03467, arXiv.org, revised Sep 2019.
  • Handle: RePEc:arx:papers:1806.03467

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1806.03467
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Dec 2017.
    2. Xinkun Nie & Stefan Wager, 2017. "Quasi-Oracle Estimation of Heterogeneous Treatment Effects," Papers 1712.04912, arXiv.org, revised Aug 2020.
    3. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    4. Robinson, Peter M, 1988. "Root-N-Consistent Semiparametric Regression," Econometrica, Econometric Society, vol. 56(4), pages 931-954, July.
    5. Arcones, Miguel A., 1995. "A Bernstein-type inequality for U-statistics and U-processes," Statistics & Probability Letters, Elsevier, vol. 22(3), pages 239-247, February.
    6. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, December.
    7. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey, 2017. "Double/Debiased/Neyman Machine Learning of Treatment Effects," American Economic Review, American Economic Association, vol. 107(5), pages 261-265, May.

    Citations

    Cited by:

    1. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    2. Lechner, Michael, 2018. "Modified Causal Forests for Estimating Heterogeneous Causal Effects," IZA Discussion Papers 12040, Institute of Labor Economics (IZA).
    3. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Oct 2020.
    4. Dylan J. Foster & Vasilis Syrgkanis, 2019. "Orthogonal Statistical Learning," Papers 1901.09036, arXiv.org, revised Sep 2020.
    5. Knaus, Michael C. & Lechner, Michael & Strittmatter, Anthony, 2018. "Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence," IZA Discussion Papers 12039, Institute of Labor Economics (IZA).
    6. Yiyan Huang & Cheuk Hang Leung & Xing Yan & Qi Wu & Nanbo Peng & Dongdong Wang & Zhixiang Huang, 2020. "The Causal Learning of Retail Delinquency," Papers 2012.09448, arXiv.org.
    7. Rahul Singh & Liyuan Xu & Arthur Gretton, 2020. "Kernel Methods for Policy Evaluation: Treatment Effects, Mediation Analysis, and Off-Policy Planning," Papers 2010.04855, arXiv.org, revised Oct 2020.
    8. Gubela, Robin M. & Lessmann, Stefan & Jaroszewicz, Szymon, 2020. "Response transformation and profit decomposition for revenue uplift modeling," European Journal of Operational Research, Elsevier, vol. 283(2), pages 647-661.
    9. Krikamol Muandet & Wittawat Jitkrittum & Jonas Kubler, 2020. "Kernel Conditional Moment Test via Maximum Moment Restriction," Papers 2002.09225, arXiv.org, revised Jun 2020.
    10. Rahul Singh, 2020. "Kernel Methods for Unobserved Confounding: Negative Controls, Proxies, and Instruments," Papers 2012.10315, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guido Imbens, 2019. "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics," NBER Working Papers 26104, National Bureau of Economic Research, Inc.
    2. Marica Valente, 2020. "Heterogeneous effects of waste pricing policies," Papers 2010.01105, arXiv.org, revised Oct 2020.
    3. Jiaming Mao & Jingzhi Xu, 2020. "Ensemble Learning with Statistical and Structural Models," Papers 2006.05308, arXiv.org.
    4. Heigle, Julia & Pfeiffer, Friedhelm, 2019. "An analysis of selected labor market outcomes of college dropouts in Germany: A machine learning estimation approach. Research report," ZEW Expertises, ZEW - Leibniz Centre for European Economic Research, number 222378.
    5. Amit Sharma & Emre Kiciman, 2020. "DoWhy: An End-to-End Library for Causal Inference," Papers 2011.04216, arXiv.org.
    6. Whitney K. Newey & James M. Robins, 2017. "Cross-fitting and fast remainder rates for semiparametric estimation," CeMMAP working papers CWP41/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    7. Phillip Heiler, 2020. "Efficient Covariate Balancing for the Local Average Treatment Effect," Papers 2007.04346, arXiv.org.
    8. Sant’Anna, Pedro H.C. & Zhao, Jun, 2020. "Doubly robust difference-in-differences estimators," Journal of Econometrics, Elsevier, vol. 219(1), pages 101-122.
    9. Huber Martin & Wüthrich Kaspar, 2019. "Local Average and Quantile Treatment Effects Under Endogeneity: A Review," Journal of Econometric Methods, De Gruyter, vol. 8(1), pages 1-27, January.
    10. Sander Gerritsen & Mark Kattenberg & Sonny Kuijpers, 2019. "The impact of age at arrival on education and mental health," CPB Discussion Paper 389.rdf, CPB Netherlands Bureau for Economic Policy Analysis.
    11. Sander Gerritsen & Mark Kattenberg & Sonny Kuijpers, 2019. "The impact of age at arrival on education and mental health," CPB Discussion Paper 389, CPB Netherlands Bureau for Economic Policy Analysis.
    12. Duncan Simester & Artem Timoshenko & Spyros I. Zoumpoulis, 2020. "Targeting Prospective Customers: Robustness of Machine-Learning Methods to Typical Data Challenges," Management Science, INFORMS, vol. 66(6), pages 2495-2522, June.
    13. Sven Klaassen & Jannis Kuck & Martin Spindler & Victor Chernozhukov, 2018. "Uniform Inference in High-Dimensional Gaussian Graphical Models," Papers 1808.10532, arXiv.org, revised Dec 2018.
    14. Victor Chernozhukov & Whitney Newey & Vira Semenova, 2019. "Inference on weighted average value function in high-dimensional state space," Papers 1908.09173, arXiv.org.
    15. Vira Semenova, 2018. "Machine Learning for Dynamic Discrete Choice," Papers 1808.02569, arXiv.org, revised Nov 2018.
    16. Maria Cuellar & Edward H. Kennedy, 2020. "A non‐parametric projection‐based estimator for the probability of causation, with application to water sanitation in Kenya," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(4), pages 1793-1818, October.
    17. Andres Algaba & David Ardia & Keven Bluteau & Samuel Borms & Kris Boudt, 2020. "Econometrics Meets Sentiment: An Overview Of Methodology And Applications," Journal of Economic Surveys, Wiley Blackwell, vol. 34(3), pages 512-547, July.
    18. Jong Hee Park & Byung Koo Kim, 2020. "Why your neighbor matters: Positions in preferential trade agreement networks and export growth in global value chains," Economics and Politics, Wiley Blackwell, vol. 32(3), pages 381-410, November.
    19. Monica Andini & Emanuele Ciani & Guido de Blasio & Alessio D'Ignazio & Viola Salvestrini, 2017. "Targeting policy-compliers with machine learning: an application to a tax rebate programme in Italy," Temi di discussione (Economic working papers) 1158, Bank of Italy, Economic Research and International Relations Area.
    20. Francesco Decarolis & Cristina Giorgiantonio, 2020. "Corruption red flags in public procurement: new evidence from Italian calls for tenders," Questioni di Economia e Finanza (Occasional Papers) 544, Bank of Italy, Economic Research and International Relations Area.
