
Training and Testing with Multiple Splits: A Central Limit Theorem for Split-Sample Estimators

Author

  • Bruno Fava

Abstract

As predictive algorithms grow in popularity, using the same dataset to both train and test a new model has become routine across research, policy, and industry. Sample-splitting attains valid inference on model properties by using separate subsamples to estimate the model and to evaluate it. However, this approach has two drawbacks: each task uses only part of the data, and different splits can lead to widely different estimates. I develop an inference approach that averages across multiple splits, which uses more data for training, uses the entire sample for testing, and improves reproducibility. I address the statistical dependence from reusing observations across splits by proving a new central limit theorem for a large class of split-sample estimators under arguably mild and general conditions. Importantly, I make no restrictions on model complexity or convergence rates. I show that confidence intervals based on the normal approximation are valid for many applications, but may undercover in important cases of interest, such as comparing the performance of two models. I develop a new inference approach for such cases, explicitly accounting for the dependence across splits. Moreover, I provide a measure of reproducibility for p-values obtained from split-sample estimators. Finally, I apply my results to two important problems in development and public economics: predicting poverty and learning heterogeneous treatment effects in randomized experiments. I show that my inference approach with repeated cross-fitting achieves better power than existing alternatives, often enough to reveal statistical significance that would otherwise be missed.
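
To make the idea of averaging over multiple splits concrete, here is a minimal sketch of a repeated cross-fitting point estimate. It is an illustration under stated assumptions, not the paper's procedure: the target (out-of-sample mean squared error), the placeholder model (ordinary least squares), and the function names `cross_fit_mse` and `repeated_cross_fit_mse` are all hypothetical choices made here. The sketch computes only the averaged estimate and the per-split estimates; it does not implement the paper's central limit theorem, variance estimator, or confidence intervals.

```python
import numpy as np


def cross_fit_mse(X, y, n_folds, rng):
    """One cross-fitting pass: every observation is tested exactly once,
    by a model trained on the folds that do not contain it."""
    n = len(y)
    perm = rng.permutation(n)              # one random split of the sample
    folds = np.array_split(perm, n_folds)
    sq_err = np.empty(n)
    for test_idx in folds:
        train_idx = np.setdiff1d(perm, test_idx)
        # Placeholder model (an assumption): OLS with an intercept.
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        beta, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        sq_err[test_idx] = (y[test_idx] - A_test @ beta) ** 2
    return sq_err.mean()


def repeated_cross_fit_mse(X, y, n_folds=5, n_repeats=20, seed=0):
    """Average the cross-fitted estimate over several independent random splits."""
    rng = np.random.default_rng(seed)
    estimates = np.array(
        [cross_fit_mse(X, y, n_folds, rng) for _ in range(n_repeats)]
    )
    return estimates.mean(), estimates


# Simulated data, purely for illustration.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=500)

avg, per_split = repeated_cross_fit_mse(X, y)
print("averaged estimate:", avg)
print("range across single splits:", per_split.min(), per_split.max())
```

The spread of `per_split` shows how much a single-split estimate can move with the choice of split, which is the reproducibility concern the abstract raises. Because the same observations are reused across splits, the per-split estimates are dependent, so their sample standard deviation should not be read as a standard error for the average.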

Suggested Citation

  • Bruno Fava, 2025. "Training and Testing with Multiple Splits: A Central Limit Theorem for Split-Sample Estimators," Papers 2511.04957, arXiv.org, revised Nov 2025.
  • Handle: RePEc:arx:papers:2511.04957

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2511.04957
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    2. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    3. Semenova, Vira, 2023. "Debiased machine learning of set-identified linear models," Journal of Econometrics, Elsevier, vol. 235(2), pages 1725-1746.
    4. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    5. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    6. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    7. Nathan Kallus, 2022. "Treatment Effect Risk: Bounds and Inference," Papers 2201.05893, arXiv.org, revised Jul 2022.
    8. Semenova, Vira, 2025. "Generalized Lee bounds," Journal of Econometrics, Elsevier, vol. 251(C).
    9. Nathan Kallus, 2022. "What's the Harm? Sharp Bounds on the Fraction Negatively Affected by Treatment," Papers 2205.10327, arXiv.org, revised Nov 2022.
    10. Yanqin Fan & Yuan Qi & Gaoqian Xu, 2025. "Policy Learning with $\alpha$-Expected Welfare," Papers 2505.00256, arXiv.org.
    11. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-Dimensional Econometrics and Regularized GMM," Papers 1806.01888, arXiv.org, revised Jun 2018.
    12. Justin Whitehouse & Morgane Austern & Vasilis Syrgkanis, 2025. "Inference on Optimal Policy Values and Other Irregular Functionals via Smoothing," Papers 2507.11780, arXiv.org.
    13. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org, revised Aug 2022.
    14. A Stefano Caria & Grant Gordon & Maximilian Kasy & Simon Quinn & Soha Osman Shami & Alexander Teytelboym, 2024. "An Adaptive Targeted Field Experiment: Job Search Assistance for Refugees in Jordan," Journal of the European Economic Association, European Economic Association, vol. 22(2), pages 781-836.
    15. Yuan Liao & Anna Simoni, 2012. "Semi-parametric Bayesian Partially Identified Models based on Support Function," Papers 1212.3267, arXiv.org, revised Nov 2013.
    16. Nathan Kallus & Miruna Oprescu, 2022. "Robust and Agnostic Learning of Conditional Distributional Treatment Effects," Papers 2205.11486, arXiv.org, revised Jun 2025.
    17. Nan Liu & Yanbo Liu & Yuya Sasaki & Yuanyuan Wan, 2025. "Nonparametric Uniform Inference in Binary Classification and Policy Values," Papers 2511.14700, arXiv.org, revised Dec 2025.
    18. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    19. Xiaohong Chen & Timothy M. Christensen & Elie Tamer, 2018. "Monte Carlo Confidence Sets for Identified Sets," Econometrica, Econometric Society, vol. 86(6), pages 1965-2018, November.
    20. Guido W. Imbens, 2020. "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics," Journal of Economic Literature, American Economic Association, vol. 58(4), pages 1129-1179, December.
