Printed from https://ideas.repec.org/a/wly/emetrp/v85y2017ip233-298.html

Program Evaluation and Causal Inference With High‐Dimensional Data

Author

Listed:
  • A. Belloni
  • V. Chernozhukov
  • I. Fernández‐Val
  • C. Hansen

Abstract

In this paper, we provide efficient estimators and honest confidence bands for a variety of treatment effects including local average (LATE) and local quantile treatment effects (LQTE) in data‐rich environments. We can handle very many control variables, endogenous receipt of treatment, heterogeneous treatment effects, and function‐valued outcomes. Our framework covers the special case of exogenous receipt of treatment, either conditional on controls or unconditionally as in randomized control trials. In the latter case, our approach produces efficient estimators and honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE). To make informative inference possible, we assume that key reduced‐form predictive relationships are approximately sparse. This assumption allows the use of regularization and selection methods to estimate those relations, and we provide methods for post‐regularization and post‐selection inference that are uniformly valid (honest) across a wide range of models. We show that a key ingredient enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating certain reduced‐form functional parameters. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility and participation on accumulated assets. The results on program evaluation are obtained as a consequence of more general results on honest inference in a general moment‐condition framework, which arises from structural equation models in econometrics. Here, too, the crucial ingredient is the use of orthogonal moment conditions, which can be constructed from the initial moment conditions. 
We provide results on honest inference for (function‐valued) parameters within this general framework where any high‐quality machine learning methods (e.g., boosted trees, deep neural networks, random forests, and their aggregated and hybrid versions) can be used to learn the nonparametric/high‐dimensional components of the model. These include a number of supporting auxiliary results that are of major independent interest: namely, we (1) prove uniform validity of a multiplier bootstrap, (2) offer a uniformly valid functional delta method, and (3) provide results for sparsity‐based estimation of regression functions for function‐valued outcomes.
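
As a rough illustration of two ingredients named in the abstract, the sketch below computes a doubly robust (orthogonal) score for the ATE and then applies a Gaussian multiplier bootstrap to that score. It is not the paper's procedure, and everything in it is an assumption made for illustration: the data are simulated from a toy randomized trial with a known treatment probability of 0.5, a one-covariate least-squares fit stands in for the regularized or machine learning estimators of the reduced forms, and the target is a scalar, so the "band" reduces to a single interval. The names `simulate`, `fit_line`, `dr_scores`, and `ate_with_multiplier_bootstrap` are hypothetical.

```python
import random
import statistics


def simulate(n, seed=0):
    """Toy randomized trial (assumed): Y = 1.0*D + 2.0*X + noise, so the true ATE is 1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        d = 1 if rng.random() < 0.5 else 0
        y = 1.0 * d + 2.0 * x + rng.gauss(0.0, 1.0)
        data.append((y, d, x))
    return data


def fit_line(points):
    """Least-squares fit of y = a + b*x; a stand-in for a lasso or ML learner."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def dr_scores(data, propensity=0.5):
    """Doubly robust score per unit: regression contrast plus weighted residuals."""
    g1 = fit_line([(x, y) for y, d, x in data if d == 1])  # outcome model, treated
    g0 = fit_line([(x, y) for y, d, x in data if d == 0])  # outcome model, controls
    scores = []
    for y, d, x in data:
        psi = (g1(x) - g0(x)
               + d * (y - g1(x)) / propensity
               - (1 - d) * (y - g0(x)) / (1 - propensity))
        scores.append(psi)
    return scores


def ate_with_multiplier_bootstrap(data, draws=500, seed=1):
    """Return (ATE estimate, standard error, 95% interval from a multiplier bootstrap)."""
    scores = dr_scores(data)
    n = len(scores)
    ate = statistics.fmean(scores)
    se = statistics.stdev(scores) / n ** 0.5
    centered = [s - ate for s in scores]
    rng = random.Random(seed)
    tstats = []
    for _ in range(draws):
        # Perturb the centered scores with independent N(0, 1) multipliers.
        boot = sum(rng.gauss(0.0, 1.0) * c for c in centered) / n
        tstats.append(abs(boot) / se)
    tstats.sort()
    crit = tstats[int(0.95 * draws) - 1]  # empirical 95% critical value
    return ate, se, (ate - crit * se, ate + crit * se)


if __name__ == "__main__":
    estimate, std_err, interval = ate_with_multiplier_bootstrap(simulate(2000))
    print(estimate, std_err, interval)
```

The score has the doubly robust form: the average of `psi` stays centered on the ATE if either the outcome fits or the propensity is correct, which is why first-order errors in the fitted reduced forms do not carry over to the estimate. With a function-valued target, the same multipliers would be reused across all indices and the critical value taken from the supremum of the studentized bootstrap process.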

Suggested Citation

  • A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
  • Handle: RePEc:wly:emetrp:v:85:y:2017:i::p:233-298

    Download full text from publisher

    File URL: http://hdl.handle.net/
    Download Restriction: no

