Printed from https://ideas.repec.org/p/arx/papers/2302.00469.html

Regression Adjustment, Cross-Fitting, and Randomized Experiments with Many Controls

Author

Listed:
  • Harold D Chiang
  • Yukitoshi Matsushita
  • Taisuke Otsu

Abstract

This paper studies estimation and inference for average treatment effects in randomized experiments with many covariates, under a design-based framework with a deterministic number of treated units. We show that a simple yet powerful cross-fitted regression adjustment achieves bias correction and leads to sharper asymptotic properties than existing alternatives. Specifically, we derive higher-order stochastic expansions, analyze the associated inference procedures, and propose a modified HC3 variance estimator that accounts for terms up to second order. Our analysis reveals that cross-fitting permits substantially faster growth in the covariate dimension $p$ relative to the sample size $n$, with asymptotic normality holding under favorable designs when $p = o(n^{3/4}/(\log n)^{1/2})$, improving on standard rates. We also explain and address the poor size performance of conventional variance estimators. The methodology extends naturally to stratified experiments with many strata. Simulations confirm that the cross-fitted estimator, combined with the modified HC3 estimator, delivers accurate estimation and reliable inference across diverse designs.
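
To make the idea of cross-fitting concrete, the following is a minimal sketch of a generic leave-one-out regression adjustment in a completely randomized experiment. It is an illustration under stated assumptions, not the estimator defined in the paper: the function names (`loo_predictions`, `fit_predict`, `cross_fitted_ate`), the arm-wise OLS fits, and the augmented-difference form with treated share $n_1/n$ are all choices made here for exposition. Each unit's own-arm prediction comes from a fit that excludes that unit, computed with the leverage identity $\hat{y}_{-i} = y_i - e_i/(1-h_{ii})$, the same leave-one-out residual that underlies the standard HC3 estimator.

```python
import numpy as np

def loo_predictions(X, y):
    """Leave-one-out OLS predictions (with intercept) via the leverage shortcut."""
    Z = np.column_stack([np.ones(len(y)), X])
    Q, _ = np.linalg.qr(Z)            # reduced QR; leverage h_ii = ||Q_i||^2
    h = np.sum(Q**2, axis=1)
    resid = y - Q @ (Q.T @ y)         # ordinary OLS residuals e_i
    return y - resid / (1.0 - h)      # y_i minus the leave-one-out residual

def fit_predict(X_train, y_train, X_new):
    """OLS fit (with intercept) on one sample, predictions on another."""
    Z = np.column_stack([np.ones(len(y_train)), X_train])
    beta, *_ = np.linalg.lstsq(Z, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ beta

def cross_fitted_ate(y, t, X):
    """Illustrative cross-fitted adjusted difference in means (hypothetical helper)."""
    t = np.asarray(t).astype(bool)
    n = len(y)
    pi = t.mean()                     # treated share n1 / n, fixed by design
    mu1 = np.empty(n)
    mu0 = np.empty(n)
    # Own-arm predictions leave the unit out; other-arm fits never used it.
    mu1[t] = loo_predictions(X[t], y[t])
    mu1[~t] = fit_predict(X[t], y[t], X[~t])
    mu0[~t] = loo_predictions(X[~t], y[~t])
    mu0[t] = fit_predict(X[~t], y[~t], X[t])
    psi = mu1 - mu0 + t * (y - mu1) / pi - (~t) * (y - mu0) / (1.0 - pi)
    return psi.mean()

# Small simulated example (all numbers are made up for illustration).
rng = np.random.default_rng(0)
n, p, n1 = 200, 10, 100
X = rng.normal(size=(n, p))
t = np.zeros(n, dtype=int)
t[rng.permutation(n)[:n1]] = 1        # exactly n1 treated units
y = X @ rng.normal(size=p) + 2.0 * t + rng.normal(size=n)
print(cross_fitted_ate(y, t, X))
```

Using the leverage identity avoids refitting the regression $n$ times, which matters when $p$ is large relative to $n$. The sketch needs each arm's design matrix to have full column rank and every leverage strictly below one.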

Suggested Citation

  • Harold D Chiang & Yukitoshi Matsushita & Taisuke Otsu, 2023. "Regression Adjustment, Cross-Fitting, and Randomized Experiments with Many Controls," Papers 2302.00469, arXiv.org, revised May 2025.
  • Handle: RePEc:arx:papers:2302.00469

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2302.00469
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Tirthankar Dasgupta & Natesh S. Pillai & Donald B. Rubin, 2015. "Causal inference from $2^K$ factorial designs by using potential outcomes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 77(4), pages 727-753, September.
    2. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.
    3. Philip Oreopoulos & Daniel Lang & Joshua Angrist, 2009. "Incentives and Services for College Achievement: Evidence from a Randomized Trial," American Economic Journal: Applied Economics, American Economic Association, vol. 1(1), pages 136-163, January.
    4. Lihua Lei & Peng Ding, 2021. "Regression adjustment in completely randomized experiments with a diverging number of covariates," Biometrika, Biometrika Trust, vol. 108(4), pages 815-828.
    5. Jelena Bradic & Stefan Wager & Yinchu Zhu, 2019. "Sparsity Double Robust Inference of Average Treatment Effects," Papers 1905.00744, arXiv.org.
    6. Haoge Chang & Joel Middleton & P. M. Aronow, 2021. "Exact Bias Correction for Linear Adjustment of Randomized Controlled Trials," Papers 2110.08425, arXiv.org, revised Oct 2021.
    7. Xinran Li & Peng Ding, 2020. "Rerandomization and regression adjustment," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(1), pages 241-268, February.
    8. Colin B Fogarty, 2018. "Regression-assisted inference for the average treatment effect in paired experiments," Biometrika, Biometrika Trust, vol. 105(4), pages 994-1000.
    9. Imbens, Guido W. & Rubin, Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, June.

    Citations


    Cited by:

    1. Undral Byambadalai & Tatsushi Oka & Shota Yasui, 2024. "Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction," Papers 2407.16037, arXiv.org.
    2. Yuehao Bai & Azeem M. Shaikh & Max Tabord-Meehan, 2024. "A Primer on the Analysis of Randomized Experiments and a Survey of some Recent Advances," Papers 2405.03910, arXiv.org, revised Apr 2025.
    3. Liang Jiang & Liyao Li & Ke Miao & Yichong Zhang, 2023. "Adjustment with Many Regressors Under Covariate-Adaptive Randomizations," Papers 2304.08184, arXiv.org, revised Feb 2025.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Harold D Chiang & Yukitoshi Matsushita & Taisuke Otsu, 2023. "Regression adjustment in randomized controlled trials with many covariates," STICERD - Econometrics Paper Series 627, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
    2. Jiang, Liang & Phillips, Peter C.B. & Tao, Yubo & Zhang, Yichong, 2023. "Regression-adjusted estimation of quantile treatment effects under covariate-adaptive randomizations," Journal of Econometrics, Elsevier, vol. 234(2), pages 758-776.
    3. Fangzhou Su & Peng Ding, 2021. "Model‐assisted analyses of cluster‐randomized experiments," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(5), pages 994-1015, November.
    4. Liang Jiang & Liyao Li & Ke Miao & Yichong Zhang, 2023. "Adjustment with Many Regressors Under Covariate-Adaptive Randomizations," Papers 2304.08184, arXiv.org, revised Feb 2025.
    5. Liang Jiang & Oliver B. Linton & Haihan Tang & Yichong Zhang, 2022. "Improving Estimation Efficiency via Regression-Adjustment in Covariate-Adaptive Randomizations with Imperfect Compliance," Papers 2201.13004, arXiv.org, revised Jun 2023.
    6. Zhao, Anqi & Ding, Peng, 2024. "No star is good news: A unified look at rerandomization based on p-values from covariate balance tests," Journal of Econometrics, Elsevier, vol. 241(1).
    7. Ke Zhu & Hanzhong Liu, 2023. "Pair‐switching rerandomization," Biometrics, The International Biometric Society, vol. 79(3), pages 2127-2142, September.
    8. Zhao, Anqi & Ding, Peng, 2021. "Covariate-adjusted Fisher randomization tests for the average treatment effect," Journal of Econometrics, Elsevier, vol. 225(2), pages 278-294.
    9. Haoge Chang, 2023. "Design-based Estimation Theory for Complex Experiments," Papers 2311.06891, arXiv.org, revised May 2025.
    10. Edward Wu & Johann A. Gagnon-Bartsch, 2021. "Design-Based Covariate Adjustments in Paired Experiments," Journal of Educational and Behavioral Statistics, vol. 46(1), pages 109-132, February.
    11. Lihua Lei, 2024. "Causal Interpretation of Regressions With Ranks," Papers 2406.05548, arXiv.org.
    12. Yumou Qiu & Jing Tao & Xiao‐Hua Zhou, 2021. "Inference of heterogeneous treatment effects using observational data with high‐dimensional covariates," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(5), pages 1016-1043, November.
    13. Peter Z. Schochet, 2018. "Design-Based Estimators for Average Treatment Effects for Multi-Armed RCTs," Journal of Educational and Behavioral Statistics, vol. 43(5), pages 568-593, October.
    14. Lu, Jiannan, 2016. "On randomization-based and regression-based inferences for $2^K$ factorial designs," Statistics & Probability Letters, Elsevier, vol. 112(C), pages 72-78.
    15. Xiduo Chen & Xingdong Feng & Antonio F. Galvao & Yeheng Ge, 2025. "Treatment Effects Inference with High-Dimensional Instruments and Control Variables," Papers 2503.20149, arXiv.org.
    16. Alqallaf, Fatemah A. & Huda, S. & Mukerjee, Rahul, 2019. "Causal inference from strip-plot designs in a potential outcomes framework," Statistics & Probability Letters, Elsevier, vol. 149(C), pages 55-62.
    17. Yuehao Bai & Jizhou Liu & Max Tabord-Meehan, 2022. "Inference for Matched Tuples and Fully Blocked Factorial Designs," Papers 2206.04157, arXiv.org, revised Nov 2023.
    18. Andre Rossi Oliveira, 2024. "Evaluating the Short-term Causal Effect of Early Alert on Student Performance," Research in Higher Education, Springer;Association for Institutional Research, vol. 65(7), pages 1395-1419, November.
    19. Peter L. Cohen & Colin B. Fogarty, 2022. "Gaussian prepivoting for finite population causal inference," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 295-320, April.
    20. Davide Viviano, 2019. "Policy Targeting under Network Interference," Papers 1906.10258, arXiv.org, revised Apr 2024.

