Printed from https://ideas.repec.org/p/arx/papers/1807.10100.html

Two-Step Estimation and Inference with Possibly Many Included Covariates

Author

Listed:
  • Matias D. Cattaneo
  • Michael Jansson
  • Xinwei Ma

Abstract

We study the implications of including many covariates in a first-step estimate entering a two-step estimation procedure. We find that a first-order bias emerges when the number of included covariates is "large" relative to the square root of the sample size, rendering standard inference procedures invalid. We show that the jackknife is able to estimate this "many covariates" bias consistently, thereby delivering a new automatic bias-corrected two-step point estimator. The jackknife also consistently estimates the standard error of the original two-step point estimator. For inference, we develop a valid post-bias-correction bootstrap approximation that accounts for the additional variability introduced by the jackknife bias-correction. We find that the jackknife bias-corrected point estimator and the bootstrap post-bias-correction inference perform very well in simulations, offering important improvements over conventional two-step point estimators and inference procedures, which are not robust to including many covariates. We apply our results to an array of distinct treatment effect, policy evaluation, and other applied microeconomics settings. In particular, we discuss production function and marginal treatment effect estimation in detail.
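The jackknife idea in the abstract can be illustrated with a minimal sketch. The code below uses the textbook delete-one jackknife, which may differ in detail from the construction in the paper. The toy two-step estimator (`two_step_estimate`), the simulated design, and all names are assumptions made for illustration only: the first step regresses `w` on `k` covariates by least squares, and the second step averages the squared fitted values. In this design the plug-in estimator overshoots its target by about `k / n` times the error variance in expectation, which is the kind of "many covariates" bias the abstract describes.

```python
import numpy as np


def two_step_estimate(data, k):
    """Hypothetical two-step estimator; columns of data are [w, z_1, ..., z_k]."""
    w, Z = data[:, 0], data[:, 1:1 + k]
    # First step: least-squares fit of w on the k included covariates.
    gamma, *_ = np.linalg.lstsq(Z, w, rcond=None)
    fitted = Z @ gamma
    # Second step: sample average of a nonlinear function of the first-step fit.
    return np.mean(fitted ** 2)


def jackknife(data, estimator):
    """Textbook delete-one jackknife: bias estimate, corrected estimate, standard error."""
    n = data.shape[0]
    theta_hat = estimator(data)
    # Re-run the whole procedure (both steps) with observation i left out.
    loo = np.array([estimator(np.delete(data, i, axis=0)) for i in range(n)])
    bias = (n - 1) * (loo.mean() - theta_hat)
    se = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))
    return {"estimate": theta_hat, "bias": bias,
            "corrected": theta_hat - bias, "se": se}


# Simulated example (assumed design): w = z_1 + noise, with k = 20 covariates.
rng = np.random.default_rng(0)
n, k = 200, 20
Z = rng.standard_normal((n, k))
w = Z[:, 0] + rng.standard_normal(n)
data = np.column_stack([w, Z])

result = jackknife(data, lambda d: two_step_estimate(d, k))
print(result)
```

Note that the first step is re-estimated inside every leave-one-out replication, so the jackknife sees how the second step reacts to first-step overfitting. This sketch does not implement the post-bias-correction bootstrap mentioned in the abstract.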

Suggested Citation

  • Matias D. Cattaneo & Michael Jansson & Xinwei Ma, 2018. "Two-Step Estimation and Inference with Possibly Many Included Covariates," Papers 1807.10100, arXiv.org.
  • Handle: RePEc:arx:papers:1807.10100

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1807.10100
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. repec:cdl:ucsdec:qt86c7x315 is not listed on IDEAS
    2. repec:cdl:econwp:qt86c7x315 is not listed on IDEAS
    3. Hidehiko Ichimura & Whitney K. Newey, 2022. "The influence function of semiparametric estimators," Quantitative Economics, Econometric Society, vol. 13(1), pages 29-61, January.
    4. Elia Lapenta, 2022. "A Bootstrap Specification Test for Semiparametric Models with Generated Regressors," Papers 2212.11112, arXiv.org, revised Oct 2023.
    5. Mogstad, Magne & Torgovitsky, Alexander, 2024. "Instrumental variables with unobserved heterogeneity in treatment effects," Handbook of Labor Economics, Elsevier.
    6. Yanchun Jin, 2016. "Nonparametric tests for the effect of treatment on conditional variance," KIER Working Papers 948, Kyoto University, Institute of Economic Research.
    7. Juan Carlos Escanciano & Telmo Pérez-Izquierdo, 2023. "Automatic Debiased Estimation with Machine Learning-Generated Regressors," Papers 2301.10643, arXiv.org, revised May 2025.
    8. Michal Kolesár, 2013. "Estimation in an Instrumental Variables Model With Treatment Effect Heterogeneity," Working Papers 2013-2, Princeton University. Economics Department.
    9. Breunig, Christoph & Mammen, Enno & Simoni, Anna, 2018. "Nonparametric estimation in case of endogenous selection," Journal of Econometrics, Elsevier, vol. 202(2), pages 268-285.
    10. Alexandre Belloni & Victor Chernozhukov & Ivan Fernandez-Val & Christian Hansen, 2013. "Program evaluation with high-dimensional data," CeMMAP working papers CWP77/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    11. Woutersen, Tiemen & Hausman, Jerry A., 2019. "Increasing the power of specification tests," Journal of Econometrics, Elsevier, vol. 211(1), pages 166-175.
    12. Yukitoshi Matsushita & Taisuke Otsu, 2017. "Likelihood inference on semiparametric models: Average derivative and treatment effect," STICERD - Econometrics Paper Series 592, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
    13. Mammen, Enno & Rothe, Christoph & Schienle, Melanie, 2016. "Semiparametric Estimation With Generated Covariates," Econometric Theory, Cambridge University Press, vol. 32(5), pages 1140-1177, October.
    14. Elisa Gerten & Michael Beckmann & Elisa Gerten & Matthias Kräkel, 2022. "Information and Communication Technology, Hierarchy, and Job Design," ECONtribute Discussion Papers Series 189, University of Bonn and University of Cologne, Germany.
    15. Patrick Kline & Christopher R. Walters, 2019. "On Heckits, LATE, and Numerical Equivalence," Econometrica, Econometric Society, vol. 87(2), pages 677-696, March.
    16. Yugang He, 2024. "E-commerce and foreign direct investment: pioneering a new era of trade strategies," Humanities and Social Sciences Communications, Palgrave Macmillan, vol. 11(1), pages 1-14, December.
    17. Pereda-Fernández, Santiago, 2023. "Identification and estimation of triangular models with a binary treatment," Journal of Econometrics, Elsevier, vol. 234(2), pages 585-623.
    18. Martinez-Iriarte, Julian & Sun, Yixiao, 2024. "Identification and estimation of unconditional policy effects of an endogenous binary treatment: An unconditional MTE approach," Journal of Econometrics, Elsevier, vol. 244(1).
    19. C de Chaisemartin & X D'Haultfœuille, 2018. "Fuzzy Differences-in-Differences," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 85(2), pages 999-1028.
    20. Ying-Ying Lee, 2014. "Partial Mean Processes with Generated Regressors: Continuous Treatment Effects and Nonseparable Models," Economics Series Working Papers 706, University of Oxford, Department of Economics.
    21. Matthias Westphal & Daniel A Kamhöfer & Hendrik Schmitz, 2022. "Marginal College Wage Premiums Under Selection Into Employment," The Economic Journal, Royal Economic Society, vol. 132(646), pages 2231-2272.
    22. Belloni, Alexandre & Chernozhukov, Victor & Chetverikov, Denis & Fernández-Val, Iván, 2019. "Conditional quantile processes based on series or many regressors," Journal of Econometrics, Elsevier, vol. 213(1), pages 4-29.

    More about this item

    JEL classification:

    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
    • C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models

