Two-Step Estimation and Inference with Possibly Many Included Covariates

Author

Listed:
  • Matias D Cattaneo
  • Michael Jansson
  • Xinwei Ma

Abstract

We study the implications of including many covariates in a first-step estimate entering a two-step estimation procedure. We find that a first-order bias emerges when the number of included covariates is “large” relative to the square root of the sample size, rendering standard inference procedures invalid. We show that the jackknife is able to estimate this “many covariates” bias consistently, thereby delivering a new automatic bias-corrected two-step point estimator. The jackknife also consistently estimates the standard error of the original two-step point estimator. For inference, we develop a valid post-bias-correction bootstrap approximation that accounts for the additional variability introduced by the jackknife bias-correction. We find that the jackknife bias-corrected point estimator and the bootstrap post-bias-correction inference perform very well in simulations, offering important improvements over conventional two-step point estimators and inference procedures, which are not robust to including many covariates. We apply our results to an array of distinct treatment effect, policy evaluation, and other applied microeconomics settings. In particular, we discuss production function and marginal treatment effect estimation in detail.
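
As a rough illustration of the jackknife idea described in the abstract, the sketch below applies the textbook delete-one jackknife to a toy two-step estimator. It is not taken from the paper: the function names (`jackknife`, `two_step`), the simulated data, and the choice of second step (the mean of squared first-step OLS fitted values) are all assumptions made for illustration, and the bootstrap step used for inference is not shown.

```python
import numpy as np

def jackknife(estimator, data):
    """Textbook delete-one jackknife (an assumption, not the paper's exact construction).

    Returns (bias-corrected estimate, bias estimate, standard error).
    """
    n = data.shape[0]
    theta_full = estimator(data)
    # Re-estimate with each observation (row) left out in turn.
    theta_loo = np.array([estimator(np.delete(data, i, axis=0)) for i in range(n)])
    theta_bar = theta_loo.mean()
    bias = (n - 1) * (theta_bar - theta_full)
    se = np.sqrt((n - 1) / n * np.sum((theta_loo - theta_bar) ** 2))
    return theta_full - bias, bias, se

def two_step(data):
    """Hypothetical two-step estimator: OLS first step, then a nonlinear second step."""
    w, Z = data[:, 0], data[:, 1:]
    # First step: regress w on the (possibly many) covariates Z.
    gamma, *_ = np.linalg.lstsq(Z, w, rcond=None)
    mu_hat = Z @ gamma
    # Second step: average a nonlinear function of the first-step fitted values.
    return np.mean(mu_hat ** 2)

# Simulated data (illustrative only): n observations, k included covariates.
rng = np.random.default_rng(0)
n, k = 200, 20
Z = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
w = Z[:, 1] + rng.normal(size=n)
data = np.column_stack([w, Z])

theta_bc, bias, se = jackknife(two_step, data)
print(f"plug-in: {two_step(data):.3f}  bias estimate: {bias:.3f}  "
      f"bias-corrected: {theta_bc:.3f}  jackknife s.e.: {se:.3f}")
```

In this toy design the plug-in second step is nonlinear in the first-step fit, so its bias grows with the number of included covariates `k`; the jackknife re-estimates both steps on each leave-one-out sample to approximate that bias.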

Suggested Citation

  • Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.
  • Handle: RePEc:oup:restud:v:86:y:2019:i:3:p:1095-1122.

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/restud/rdy053
    Download Restriction: Access to full text is restricted to subscribers.

    References listed on IDEAS

    1. repec:clg:wpaper:2013-20 is not listed on IDEAS
    2. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    3. Jinyong Hahn & Geert Ridder, 2013. "Asymptotic Variance of Semiparametric Estimators With Generated Regressors," Econometrica, Econometric Society, vol. 81(1), pages 315-340, January.
    4. Cattaneo, Matias D. & Crump, Richard K. & Jansson, Michael, 2014. "Bootstrapping Density-Weighted Average Derivatives," Econometric Theory, Cambridge University Press, vol. 30(6), pages 1135-1164, December.
    5. James J. Heckman & Edward Vytlacil, 2005. "Structural Equations, Treatment Effects, and Econometric Policy Evaluation," Econometrica, Econometric Society, vol. 73(3), pages 669-738, May.
    6. Cattaneo, Matias D. & Jansson, Michael & Newey, Whitney K., 2018. "Alternative Asymptotics And The Partially Linear Model With Many Regressors," Econometric Theory, Cambridge University Press, vol. 34(2), pages 277-301, April.
    7. Matthew D. Webb, 2023. "Reworking wild bootstrap‐based inference for clustered errors," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 56(3), pages 839-858, August.
    8. Matias D. Cattaneo & Michael Jansson & Whitney K. Newey, 2018. "Inference in Linear Regression Models with Many Covariates and Heteroscedasticity," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1350-1361, July.
    9. Cattaneo, Matias D., 2010. "Efficient semiparametric estimation of multi-valued treatment effects under ignorability," Journal of Econometrics, Elsevier, vol. 155(2), pages 138-154, April.
    10. Pedro Carneiro & James J. Heckman & Edward J. Vytlacil, 2011. "Estimating Marginal Returns to Education," American Economic Review, American Economic Association, vol. 101(6), pages 2754-2781, October.
    11. Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, December.
    12. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022. "Locally Robust Semiparametric Estimation," Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
    13. Sebastian Calonico & Matias D. Cattaneo & Max H. Farrell, 2018. "On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(522), pages 767-779, April.
    14. Ackerberg, Daniel & Lanier Benkard, C. & Berry, Steven & Pakes, Ariel, 2007. "Econometric Tools for Analyzing Market Outcomes," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 63, Elsevier.
    15. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    16. Kline, Patrick & Santos, Andres, 2012. "Higher order properties of the wild bootstrap under misspecification," Journal of Econometrics, Elsevier, vol. 171(1), pages 54-70.
    17. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    18. Abadie, Alberto, 2003. "Semiparametric instrumental variable estimation of treatment response models," Journal of Econometrics, Elsevier, vol. 113(2), pages 231-263, April.
    19. James J. Heckman & Sergio Urzua & Edward Vytlacil, 2006. "Understanding Instrumental Variables in Models with Essential Heterogeneity," The Review of Economics and Statistics, MIT Press, vol. 88(3), pages 389-432, August.
    20. Matias D. Cattaneo & Richard K. Crump & Michael Jansson, 2013. "Generalized Jackknife Estimators of Weighted Average Derivatives," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(504), pages 1243-1256, December.
    21. Cattaneo, Matias D. & Crump, Richard K. & Jansson, Michael, 2014. "Small Bandwidth Asymptotics For Density-Weighted Average Derivatives," Econometric Theory, Cambridge University Press, vol. 30(1), pages 176-200, February.
    22. Cattaneo, Matias D. & Crump, Richard K. & Jansson, Michael, 2010. "Robust Data-Driven Inference for Density-Weighted Average Derivatives," Journal of the American Statistical Association, American Statistical Association, vol. 105(491), pages 1070-1083.
    23. Iván Fernández-Val & Martin Weidner, 2018. "Fixed Effects Estimation of Large-T Panel Data Models," Annual Review of Economics, Annual Reviews, vol. 10(1), pages 109-138, August.
    24. Heejung Bang & James M. Robins, 2005. "Doubly Robust Estimation in Missing Data and Causal Inference Models," Biometrics, The International Biometric Society, vol. 61(4), pages 962-973, December.
    25. Newey, Whitney K, 1994. "The Asymptotic Variance of Semiparametric Estimators," Econometrica, Econometric Society, vol. 62(6), pages 1349-1382, November.
    26. Matias D. Cattaneo, 2010. "Multi-valued treatment effects," The New Palgrave Dictionary of Economics, Palgrave Macmillan.
    27. Ichimura, Hidehiko & Todd, Petra E., 2007. "Implementing Nonparametric and Semiparametric Estimators," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 74, Elsevier.
    28. Olley, G Steven & Pakes, Ariel, 1996. "The Dynamics of Productivity in the Telecommunications Equipment Industry," Econometrica, Econometric Society, vol. 64(6), pages 1263-1297, November.
    29. Belloni, Alexandre & Chernozhukov, Victor & Chetverikov, Denis & Kato, Kengo, 2015. "Some new asymptotic theory for least squares series: Pointwise and uniform results," Journal of Econometrics, Elsevier, vol. 186(2), pages 345-366.
    30. Matias D. Cattaneo & Michael Jansson, 2018. "Kernel-Based Semiparametric Estimators: Small Bandwidth Asymptotics and Bootstrap Consistency," Econometrica, Econometric Society, vol. 86(3), pages 955-995, May.
    31. Chen, Xiaohong, 2007. "Large Sample Sieve Estimation of Semi-Nonparametric Models," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 76, Elsevier.
    32. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    33. Jianqing Fan & Jinchi Lv & Lei Qi, 2011. "Sparse High-Dimensional Models in Economics," Annual Review of Economics, Annual Reviews, vol. 3(1), pages 291-317, September.
    34. Joshua D. Angrist & Kathryn Graddy & Guido W. Imbens, 2000. "The Interpretation of Instrumental Variables Estimators in Simultaneous Equations Models with an Application to the Demand for Fish," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 67(3), pages 499-527.
    35. Jeffrey M. Wooldridge, 2015. "Control Function Methods in Applied Econometrics," Journal of Human Resources, University of Wisconsin Press, vol. 50(2), pages 420-445.
    36. Bjorklund, Anders & Moffitt, Robert, 1987. "The Estimation of Wage Gains and Welfare Gains in Self-selection," The Review of Economics and Statistics, MIT Press, vol. 69(1), pages 42-49, February.
    37. Edward Vytlacil, 2002. "Independence, Monotonicity, and Latent Index Models: An Equivalence Result," Econometrica, Econometric Society, vol. 70(1), pages 331-341, January.

    Citations


    Cited by:

    1. Matias D. Cattaneo & Max H. Farrell & Michael Jansson & Ricardo Masini, 2022. "Higher-order Refinements of Small Bandwidth Asymptotics for Density-Weighted Average Derivative Estimators," Papers 2301.00277, arXiv.org, revised Feb 2024.
    2. Steven F. Lehrer & Tian Xie, 2022. "The Bigger Picture: Combining Econometrics with Analytics Improves Forecasts of Movie Success," Management Science, INFORMS, vol. 68(1), pages 189-210, January.
    3. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    4. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    5. Anatolyev, Stanislav, 2021. "Mallows criterion for heteroskedastic linear regressions with many regressors," Economics Letters, Elsevier, vol. 203(C).
    6. Aristide Houndetoungan & Abdoul Haki Maoude, 2024. "Inference for Two-Stage Extremum Estimators," Papers 2402.05030, arXiv.org.
    7. Mitchener, Kris & Richardson, Gary, 2020. "Contagion of Fear," CEPR Discussion Papers 14510, C.E.P.R. Discussion Papers.
    8. Kuanhao Jiang & Rajarshi Mukherjee & Subhabrata Sen & Pragya Sur, 2022. "A New Central Limit Theorem for the Augmented IPW Estimator: Variance Inflation, Cross-Fit Covariance and Beyond," Papers 2205.10198, arXiv.org, revised Oct 2022.
    9. Jinyong Hahn & Jerry Hausman, 2021. "Problems with the Control Variable Approach in Achieving Unbiased Estimates in Nonlinear Models in the Presence of Many Instruments," Journal of Quantitative Economics, Springer;The Indian Econometric Society (TIES), vol. 19(1), pages 39-58, December.
    10. Sebastian Calonico & Matias D. Cattaneo & Max H. Farrell, 2018. "Coverage Error Optimal Confidence Intervals for Local Polynomial Regression," Papers 1808.01398, arXiv.org, revised Jul 2021.
    11. Liang Jiang & Liyao Li & Ke Miao & Yichong Zhang, 2023. "Adjustment with Many Regressors Under Covariate-Adaptive Randomizations," Papers 2304.08184, arXiv.org, revised Feb 2024.
    12. Guo, Xu & Li, Runze & Liu, Jingyuan & Zeng, Mudong, 2023. "Statistical inference for linear mediation models with high-dimensional mediators and application to studying stock reaction to COVID-19 pandemic," Journal of Econometrics, Elsevier, vol. 235(1), pages 166-179.
    13. Jinyong Hahn & David W. Hughes & Guido Kuersteiner & Whitney K. Newey, 2022. "Efficient Bias Correction for Cross-section and Panel Data," Papers 2207.09943, arXiv.org, revised Jan 2024.
    14. Jochmans, Koen & Higgins, Ayden, 2022. "Bootstrap inference for fixed-effect models," TSE Working Papers 22-1328, Toulouse School of Economics (TSE), revised Dec 2023.
    15. Cattaneo, Matias D. & Jansson, Michael, 2022. "Average Density Estimators: Efficiency And Bootstrap Consistency," Econometric Theory, Cambridge University Press, vol. 38(6), pages 1140-1174, December.
    16. Su, Liangjun & Ura, Takuya & Zhang, Yichong, 2019. "Non-separable models with high-dimensional data," Journal of Econometrics, Elsevier, vol. 212(2), pages 646-677.
    17. Annalivia Polselli, 2023. "Robust Inference in Panel Data Models: Some Effects of Heteroskedasticity and Leveraged Data in Small Samples," Papers 2312.17676, arXiv.org.
    18. Yang Ning & Sida Peng & Jing Tao, 2020. "Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data," Papers 2009.03151, arXiv.org.
    19. Naguib, Costanza, 2022. "Analytical bias correction for two-step fixed effects models with copula-distributed errors," Economics Letters, Elsevier, vol. 215(C).
    20. Gourieroux, Christian & Tiomo, Andre, 2019. "The Evaluation of Model Risk for Probability of Default and Expected Loss," MPRA Paper 95795, University Library of Munich, Germany.
    21. Yumou Qiu & Jing Tao & Xiao‐Hua Zhou, 2021. "Inference of heterogeneous treatment effects using observational data with high‐dimensional covariates," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(5), pages 1016-1043, November.
    22. Zhao, Puying & Haziza, David & Wu, Changbao, 2020. "Survey weighted estimating equation inference with nuisance functionals," Journal of Econometrics, Elsevier, vol. 216(2), pages 516-536.
    23. Harold D Chiang & Yukitoshi Matsushita & Taisuke Otsu, 2023. "Regression adjustment in randomized controlled trials with many covariates," Papers 2302.00469, arXiv.org, revised Nov 2023.
    24. Harold D Chiang & Yukitoshi Matsushita & Taisuke Otsu, 2023. "Regression adjustment in randomized controlled trials with many covariates," STICERD - Econometrics Paper Series 627, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
    25. Xinwei Ma & Jingshen Wang, 2018. "Robust Inference Using Inverse Probability Weighting," Papers 1810.11397, arXiv.org, revised May 2019.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    2. Sant’Anna, Pedro H.C. & Zhao, Jun, 2020. "Doubly robust difference-in-differences estimators," Journal of Econometrics, Elsevier, vol. 219(1), pages 101-122.
    3. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    4. Su, Liangjun & Ura, Takuya & Zhang, Yichong, 2019. "Non-separable models with high-dimensional data," Journal of Econometrics, Elsevier, vol. 212(2), pages 646-677.
    5. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022. "Locally Robust Semiparametric Estimation," Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
    6. Haitian Xie, 2020. "Efficient and Robust Estimation of the Generalized LATE Model," Papers 2001.06746, arXiv.org, revised Feb 2022.
    7. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    8. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    9. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    10. Robert A. Moffitt & Matthew V. Zahn, 2019. "The Marginal Labor Supply Disincentives of Welfare: Evidence from Administrative Barriers to Participation," NBER Working Papers 26028, National Bureau of Economic Research, Inc.
    11. Pereda-Fernández, Santiago, 2023. "Identification and estimation of triangular models with a binary treatment," Journal of Econometrics, Elsevier, vol. 234(2), pages 585-623.
    12. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    13. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    14. Matias D. Cattaneo & Michael Jansson & Whitney K. Newey, 2018. "Inference in Linear Regression Models with Many Covariates and Heteroscedasticity," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1350-1361, July.
    15. Dong, Chaohua & Gao, Jiti & Linton, Oliver, 2023. "High dimensional semiparametric moment restriction models," Journal of Econometrics, Elsevier, vol. 232(2), pages 320-345.
    16. Max H. Farrell, 2013. "Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations," Papers 1309.4686, arXiv.org, revised Feb 2018.
    17. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    18. Sasaki, Yuya & Ura, Takuya, 2023. "Estimation and inference for policy relevant treatment effects," Journal of Econometrics, Elsevier, vol. 234(2), pages 394-450.
    19. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    20. Chunrong Ai & Oliver Linton & Kaiji Motegi & Zheng Zhang, 2021. "A unified framework for efficient estimation of general treatment models," Quantitative Economics, Econometric Society, vol. 12(3), pages 779-816, July.

    More about this item

    Keywords

    Many covariates asymptotics; Robust inference; Bias Correction; Resampling Methods; M-estimation;

    JEL classification:

    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
    • C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models
