
Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations


  • Max H. Farrell


This paper concerns robust inference on average treatment effects following model selection. In the selection on observables framework, we show how to construct confidence intervals based on a doubly-robust estimator that are robust to model selection errors and prove that they are valid uniformly over a large class of treatment effect models. The class allows for multivalued treatments with heterogeneous effects (in observables), general heteroskedasticity, and selection amongst (possibly) more covariates than observations. Our estimator attains the semiparametric efficiency bound under appropriate conditions. Precise conditions are given for any model selector to yield these results, and we show how to combine data-driven selection with economic theory. For implementation, we give a specific proposal for selection based on the group lasso, which is particularly well-suited to treatment effects data, and derive new results for high-dimensional, sparse multinomial logistic regression. A simulation study shows our estimator performs very well in finite samples over a wide range of models. Revisiting the National Supported Work demonstration data, our method yields accurate estimates and tight confidence intervals.
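As a rough illustration of the kind of doubly-robust estimator the abstract refers to, the sketch below uses the standard augmented inverse-probability-weighting form for the mean outcome under a treatment level, then contrasts two levels and forms a normal-approximation confidence interval. It is not the paper's implementation: the group-lasso selection step is omitted, the fitted outcome regressions and propensity scores are taken as given inputs, and every function name, variable name, and simulated data-generating process here is a hypothetical choice made for the example.

```python
import math
import random


def dr_scores(y, d, t, mu_t, p_t):
    """Per-observation doubly robust scores for E[Y(t)].

    y, d  : observed outcomes and treatment levels
    mu_t  : fitted values of E[Y | D = t, X] at each observation (assumed given)
    p_t   : fitted values of P(D = t | X) at each observation (assumed given)
    """
    return [
        m + (1.0 if di == t else 0.0) * (yi - m) / p
        for yi, di, m, p in zip(y, d, mu_t, p_t)
    ]


def dr_contrast(y, d, t1, t0, mu1, mu0, p1, p0, z=1.96):
    """Estimate E[Y(t1)] - E[Y(t0)] with a standard error and interval."""
    s1 = dr_scores(y, d, t1, mu1, p1)
    s0 = dr_scores(y, d, t0, mu0, p0)
    diff = [a - b for a, b in zip(s1, s0)]
    n = len(diff)
    est = sum(diff) / n
    var = sum((v - est) ** 2 for v in diff) / n  # variance of the scores
    se = math.sqrt(var / n)
    return est, se, (est - z * se, est + z * se)


# Hypothetical simulated data: binary treatment, true average effect = 2.
rng = random.Random(0)
n = 5000
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
p = [1.0 / (1.0 + math.exp(-xi)) for xi in x]          # true propensity
d = [1 if rng.random() < pi else 0 for pi in p]
y = [1.0 + xi + 2.0 * di + rng.gauss(0.0, 1.0) for xi, di in zip(x, d)]

# Case 1: both nuisance functions correct.
mu1 = [3.0 + xi for xi in x]
mu0 = [1.0 + xi for xi in x]
p1, p0 = p, [1.0 - pi for pi in p]
est, se, ci = dr_contrast(y, d, 1, 0, mu1, mu0, p1, p0)

# Case 2: outcome model deliberately wrong (all zeros), propensity correct.
zeros = [0.0] * n
est_mis, se_mis, ci_mis = dr_contrast(y, d, 1, 0, zeros, zeros, p1, p0)

print("correct nuisances:      ", est, ci)
print("misspecified outcome fit:", est_mis, ci_mis)
```

The second call shows the "doubly robust" idea in its simplest form: with a correct propensity score the point estimate stays centred near the true effect even when the outcome regression is wrong, though the interval is typically wider. In practice both nuisance functions would be estimated, for instance after a variable-selection step such as the one the abstract proposes.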

Suggested Citation

  • Max H. Farrell, 2013. "Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations," Papers 1309.4686, revised Feb 2018.
  • Handle: RePEc:arx:papers:1309.4686


