IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2205.10198.html
   My bibliography  Save this paper

A New Central Limit Theorem for the Augmented IPW Estimator: Variance Inflation, Cross-Fit Covariance and Beyond

Author

Listed:
  • Kuanhao Jiang
  • Rajarshi Mukherjee
  • Subhabrata Sen
  • Pragya Sur

Abstract

Estimation of the average treatment effect (ATE) is a central problem in causal inference. In recent times, inference for the ATE in the presence of high-dimensional covariates has been extensively studied. Among the diverse approaches that have been proposed, augmented inverse probability weighting (AIPW) with cross-fitting has emerged a popular choice in practice. In this work, we study this cross-fit AIPW estimator under well-specified outcome regression and propensity score models in a high-dimensional regime where the number of features and samples are both large and comparable. Under assumptions on the covariate distribution, we establish a new central limit theorem for the suitably scaled cross-fit AIPW that applies without any sparsity assumptions on the underlying high-dimensional parameters. Our CLT uncovers two crucial phenomena among others: (i) the AIPW exhibits a substantial variance inflation that can be precisely quantified in terms of the signal-to-noise ratio and other problem parameters, (ii) the asymptotic covariance between the pre-cross-fit estimators is non-negligible even on the root-n scale. These findings are strikingly different from their classical counterparts. On the technical front, our work utilizes a novel interplay between three distinct tools--approximate message passing theory, the theory of deterministic equivalents, and the leave-one-out approach. We believe our proof techniques should be useful for analyzing other two-stage estimators in this high-dimensional regime. Finally, we complement our theoretical results with simulations that demonstrate both the finite sample efficacy of our CLT and its robustness to our assumptions.

Suggested Citation

  • Kuanhao Jiang & Rajarshi Mukherjee & Subhabrata Sen & Pragya Sur, 2022. "A New Central Limit Theorem for the Augmented IPW Estimator: Variance Inflation, Cross-Fit Covariance and Beyond," Papers 2205.10198, arXiv.org, revised Oct 2022.
  • Handle: RePEc:arx:papers:2205.10198
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2205.10198
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Cattaneo, Matias D. & Jansson, Michael & Newey, Whitney K., 2018. "Alternative Asymptotics And The Partially Linear Model With Many Regressors," Econometric Theory, Cambridge University Press, vol. 34(2), pages 277-301, April.
    2. Matias D. Cattaneo & Michael Jansson & Whitney K. Newey, 2018. "Inference in Linear Regression Models with Many Covariates and Heteroscedasticity," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1350-1361, July.
    3. Jelena Bradic & Stefan Wager & Yinchu Zhu, 2019. "Sparsity Double Robust Inference of Average Treatment Effects," Papers 1905.00744, arXiv.org.
    4. Hahn, Jinyong, 2002. "Optimal Inference With Many Instruments," Econometric Theory, Cambridge University Press, vol. 18(1), pages 140-168, February.
    5. Alberto Abadie & Guido W. Imbens, 2016. "Matching on the Estimated Propensity Score," Econometrica, Econometric Society, vol. 84, pages 781-807, March.
    6. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.
    7. Andrews,Donald W. K. & Stock,James H. (ed.), 2005. "Identification and Inference for Econometric Models," Cambridge Books, Cambridge University Press, number 9780521844413.
    8. Alberto Abadie & Guido W. Imbens, 2011. "Bias-Corrected Matching Estimators for Average Treatment Effects," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 29(1), pages 1-11, January.
    9. Keisuke Hirano & Guido W. Imbens & Geert Ridder, 2003. "Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score," Econometrica, Econometric Society, vol. 71(4), pages 1161-1189, July.
    10. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    11. Bekker, Paul A, 1994. "Alternative Approximations to the Distributions of Instrumental Variable Estimators," Econometrica, Econometric Society, vol. 62(3), pages 657-681, May.
    12. Susan Athey & Guido W. Imbens, 2019. "Machine Learning Methods That Economists Should Know About," Annual Review of Economics, Annual Reviews, vol. 11(1), pages 685-725, August.
    13. Kosuke Imai & Marc Ratkovic, 2014. "Covariate balancing propensity score," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 243-263, January.
    14. Athey, Susan & Imbens, Guido W., 2019. "Machine Learning Methods Economists Should Know About," Research Papers 3776, Stanford University, Graduate School of Business.
    15. Z Tan, 2020. "Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data," Biometrika, Biometrika Trust, vol. 107(1), pages 137-158.
    16. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controlsâ€," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    17. Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
    18. Yang Ning & Peng Sida & Kosuke Imai, 2020. "Robust estimation of causal effects via a high-dimensional covariate balancing propensity score," Biometrika, Biometrika Trust, vol. 107(3), pages 533-554.
    19. Jinyong Hahn, 1998. "On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects," Econometrica, Econometric Society, vol. 66(2), pages 315-332, March.
    20. Heejung Bang & James M. Robins, 2005. "Doubly Robust Estimation in Missing Data and Causal Inference Models," Biometrics, The International Biometric Society, vol. 61(4), pages 962-973, December.
    21. Stanislav Anatolyev, 2019. "Many Instruments And/Or Regressors: A Friendly Guide," Journal of Economic Surveys, Wiley Blackwell, vol. 33(2), pages 689-726, April.
    22. Tengyuan Liang & Pragya Sur, 2020. "A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-L1-Norm Interpolated Classifiers," Working Papers 2020-152, Becker Friedman Institute for Research In Economics.
    23. Lucas Janson & Rina Foygel Barber & Emmanuel Candès, 2017. "EigenPrism: inference for high dimensional signal-to-noise ratios," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(4), pages 1037-1065, September.
    24. Fan Li & Kari Lock Morgan & Alan M. Zaslavsky, 2018. "Balancing Covariates via Propensity Score Weighting," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(521), pages 390-400, January.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    2. Michael Pollmann, 2020. "Causal Inference for Spatial Treatments," Papers 2011.00373, arXiv.org, revised Jan 2023.
    3. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    5. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    6. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    7. Frölich, Markus & Huber, Martin & Wiesenfarth, Manuel, 2017. "The finite sample performance of semi- and non-parametric estimators for treatment effects and policy evaluation," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 91-102.
    8. Difang Huang & Jiti Gao & Tatsushi Oka, 2022. "Semiparametric Single-Index Estimation for Average Treatment Effects," Papers 2206.08503, arXiv.org, revised Oct 2022.
    9. Jelena Bradic & Stefan Wager & Yinchu Zhu, 2019. "Sparsity Double Robust Inference of Average Treatment Effects," Papers 1905.00744, arXiv.org.
    10. Shixiao Zhang & Peisong Han & Changbao Wu, 2023. "Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference," International Statistical Review, International Statistical Institute, vol. 91(2), pages 165-192, August.
    11. Su, Liangjun & Ura, Takuya & Zhang, Yichong, 2019. "Non-separable models with high-dimensional data," Journal of Econometrics, Elsevier, vol. 212(2), pages 646-677.
    12. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022. "Locally Robust Semiparametric Estimation," Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
    13. Zhexiao Lin & Fang Han, 2022. "On regression-adjusted imputation estimators of the average treatment effect," Papers 2212.05424, arXiv.org, revised Jan 2023.
    14. Joseph Antonelli & Georgia Papadogeorgou & Francesca Dominici, 2022. "Causal inference in high dimensions: A marriage between Bayesian modeling and good frequentist properties," Biometrics, The International Biometric Society, vol. 78(1), pages 100-114, March.
    15. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    16. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    17. Chunrong Ai & Oliver Linton & Kaiji Motegi & Zheng Zhang, 2021. "A unified framework for efficient estimation of general treatment models," Quantitative Economics, Econometric Society, vol. 12(3), pages 779-816, July.
    18. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    19. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.
    20. Kevin P. Josey & Elizabeth Juarez‐Colunga & Fan Yang & Debashis Ghosh, 2021. "A framework for covariate balance using Bregman distances," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(3), pages 790-816, September.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2205.10198. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.