IDEAS home Printed from https://ideas.repec.org/a/wly/emetrp/v85y2017ip233-298.html

Program Evaluation and Causal Inference With High‐Dimensional Data

Author

Listed:
  • A. Belloni
  • V. Chernozhukov
  • I. Fernández‐Val
  • C. Hansen

Abstract

In this paper, we provide efficient estimators and honest confidence bands for a variety of treatment effects including local average (LATE) and local quantile treatment effects (LQTE) in data‐rich environments. We can handle very many control variables, endogenous receipt of treatment, heterogeneous treatment effects, and function‐valued outcomes. Our framework covers the special case of exogenous receipt of treatment, either conditional on controls or unconditionally as in randomized control trials. In the latter case, our approach produces efficient estimators and honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE). To make informative inference possible, we assume that key reduced‐form predictive relationships are approximately sparse. This assumption allows the use of regularization and selection methods to estimate those relations, and we provide methods for post‐regularization and post‐selection inference that are uniformly valid (honest) across a wide range of models. We show that a key ingredient enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating certain reduced‐form functional parameters. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility and participation on accumulated assets. The results on program evaluation are obtained as a consequence of more general results on honest inference in a general moment‐condition framework, which arises from structural equation models in econometrics. Here, too, the crucial ingredient is the use of orthogonal moment conditions, which can be constructed from the initial moment conditions. 
We provide results on honest inference for (function‐valued) parameters within this general framework where any high‐quality machine learning methods (e.g., boosted trees, deep neural networks, random forests, and their aggregated and hybrid versions) can be used to learn the nonparametric/high‐dimensional components of the model. These include a number of supporting auxiliary results that are of major independent interest: namely, we (1) prove uniform validity of a multiplier bootstrap, (2) offer a uniformly valid functional delta method, and (3) provide results for sparsity‐based estimation of regression functions for function‐valued outcomes.
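
To make the idea of an orthogonal (doubly robust) moment condition concrete, the following is a minimal sketch for the simplest case the abstract mentions, the ATE under exogenous treatment conditional on controls. It is an illustration under stated assumptions, not the authors' procedure: the function names (`fit_logistic`, `aipw_ate`, `simulate`) and the simulated data are hypothetical, the nuisance functions are fit by plain OLS and logistic regression rather than by the regularization and selection methods the abstract refers to, and there is no sample splitting. The score used is the standard augmented inverse-propensity-weighted form, g(1,X) - g(0,X) + D(Y - g(1,X))/m(X) - (1-D)(Y - g(0,X))/(1-m(X)) - theta.

```python
import numpy as np

def fit_logistic(Z, d, n_iter=25):
    """Newton-Raphson logistic regression (stand-in for a regularized fit)."""
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        grad = Z.T @ (d - p)                          # score of the log-likelihood
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z   # negative Hessian
        beta += np.linalg.solve(hess, grad)
    return beta

def aipw_ate(y, d, X):
    """Doubly robust (AIPW) point estimate of the ATE and its standard error."""
    Z = np.column_stack([np.ones(len(y)), X])
    # Outcome regressions g(1, x) and g(0, x), fit by OLS within each arm.
    b1, *_ = np.linalg.lstsq(Z[d == 1], y[d == 1], rcond=None)
    b0, *_ = np.linalg.lstsq(Z[d == 0], y[d == 0], rcond=None)
    g1, g0 = Z @ b1, Z @ b0
    # Propensity score m(x) = P(D = 1 | X = x), clipped away from 0 and 1.
    m = 1.0 / (1.0 + np.exp(-Z @ fit_logistic(Z, d)))
    m = np.clip(m, 0.01, 0.99)
    # Per-observation score contributions (theta is their sample mean).
    psi = g1 - g0 + d * (y - g1) / m - (1.0 - d) * (y - g0) / (1.0 - m)
    theta = psi.mean()
    se = psi.std(ddof=1) / np.sqrt(len(y))
    return theta, se

def simulate(n=4000, seed=0):
    """Hypothetical data: true ATE is 2.0, treatment depends on the controls."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    m = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.25 * X[:, 1])))
    d = (rng.uniform(size=n) < m).astype(float)
    y = 2.0 * d + X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)
    return y, d, X

if __name__ == "__main__":
    y, d, X = simulate()
    theta, se = aipw_ate(y, d, X)
    print(f"ATE estimate: {theta:.3f}  (95% CI half-width: {1.96 * se:.3f})")
```

The point of the construction is that small errors in either the outcome regressions or the propensity score have only a second-order effect on the estimate, which is the property that makes inference after regularized or machine-learned first stages feasible.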

Suggested Citation

  • A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
  • Handle: RePEc:wly:emetrp:v:85:y:2017:i::p:233-298

    Download full text from publisher

    File URL: http://hdl.handle.net/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Victor Chernozhukov & Ivan Fernandez-Val & Christian Hansen, 2013. "Program evaluation with high-dimensional data," CeMMAP working papers CWP57/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    2. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    3. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Dec 2017.
    4. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    5. Dmitry Arkhangelsky & Guido Imbens, 2018. "The Role of the Propensity Score in Fixed Effect Models," NBER Working Papers 24814, National Bureau of Economic Research, Inc.
    6. Su, Liangjun & Ura, Takuya & Zhang, Yichong, 2019. "Non-separable models with high-dimensional data," Journal of Econometrics, Elsevier, vol. 212(2), pages 646-677.
    7. Zhong, Wei & Gao, Yang & Zhou, Wei & Fan, Qingliang, 2021. "Endogenous treatment effect estimation using high-dimensional instruments and double selection," Statistics & Probability Letters, Elsevier, vol. 169(C).
    8. Sant’Anna, Pedro H.C. & Zhao, Jun, 2020. "Doubly robust difference-in-differences estimators," Journal of Econometrics, Elsevier, vol. 219(1), pages 101-122.
    9. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers CWP49/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    10. Bryan S. Graham & Cristine Campos de Xavier Pinto, 2018. "Semiparametrically efficient estimation of the average linear regression function," Papers 1810.12511, arXiv.org.
    11. Chernozhukov, Victor & Fernández-Val, Iván & Kowalski, Amanda E., 2015. "Quantile regression with censoring and endogeneity," Journal of Econometrics, Elsevier, vol. 186(1), pages 201-221.
    12. Rahul Singh & Liyuan Xu & Arthur Gretton, 2020. "Generalized Kernel Ridge Regression for Nonparametric Structural Functions and Semiparametric Treatment Effects," Papers 2010.04855, arXiv.org, revised Dec 2021.
    13. Phillip Heiler & Michael C. Knaus, 2021. "Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments," Papers 2110.01427, arXiv.org.
    14. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," Review of Economic Studies, Oxford University Press, vol. 86(3), pages 1095-1122.
    15. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    16. Andrea Morescalchi, 2021. "A new career in a new town. Job search methods and regional mobility of unemployed workers," Portuguese Economic Journal, Springer;Instituto Superior de Economia e Gestao, vol. 20(2), pages 223-272, May.
    17. Santiago Gallino & Antonio Moreno, 2018. "The Value of Fit Information in Online Retail: Evidence from a Randomized Field Experiment," Manufacturing & Service Operations Management, INFORMS, vol. 20(4), pages 767-787, October.
    18. Lee, Ying-Ying, 2018. "Efficient propensity score regression estimators of multivalued treatment effects for the treated," Journal of Econometrics, Elsevier, vol. 204(2), pages 207-222.
