
Machine learning to improve experimental design

Author

Listed:
  • Aufenanger, Tobias

Abstract

This paper proposes a way of using observational pretest data in the design of experiments. In particular, it trains a random forest on the pretest data and stratifies the allocation of treatments to experimental units on the predicted values of the dependent variable. This approach removes much of the arbitrariness involved in defining strata directly on the basis of covariates. A simulation on 300 random samples drawn from six data sets shows that the algorithm is highly effective in reducing the variance of the treatment-effect estimate compared to random allocation and to traditional forms of stratification. On average, this stratification approach requires half the sample size needed under complete randomization to estimate the treatment effect with the same precision. In more than 80% of the samples, the estimated variance of the treatment estimator is lower, and the estimated statistical power is higher, than under standard designs such as complete randomization, conventional stratification, or Mahalanobis matching.
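
The allocation idea described above can be illustrated with a short sketch: fit a random forest to the observational pretest data, predict the dependent variable for each experimental unit, sort units by their predicted outcome, and randomize treatment within strata of adjacent units. The snippet below is only a minimal illustration of this idea under simplifying assumptions (scikit-learn's RandomForestRegressor, synthetic data, two treatment arms, strata formed as consecutive pairs); the variable names are hypothetical and this is not the paper's exact algorithm.

# Minimal sketch (not the paper's implementation): stratify treatment
# allocation on outcomes predicted by a random forest trained on pretest data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical data: an observational pretest sample and an experimental sample.
n_pre, n_exp, n_cov = 500, 100, 5
X_pre = rng.normal(size=(n_pre, n_cov))
y_pre = X_pre @ rng.normal(size=n_cov) + rng.normal(size=n_pre)
X_exp = rng.normal(size=(n_exp, n_cov))  # covariates of the experimental units

# 1. Train a random forest on the pretest data.
forest = RandomForestRegressor(n_estimators=500, random_state=0)
forest.fit(X_pre, y_pre)

# 2. Predict the dependent variable for each experimental unit.
y_hat = forest.predict(X_exp)

# 3. Stratify on the predictions: sort units by predicted outcome and form
#    strata of consecutive pairs (two treatment arms, even sample size assumed).
order = np.argsort(y_hat)
treatment = np.empty(n_exp, dtype=int)
for stratum in order.reshape(-1, 2):
    treatment[rng.permutation(stratum)] = [0, 1]  # randomize within the pair

# Each stratum now contains one treated and one control unit with similar
# predicted outcomes, which is what drives the variance reduction.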

Suggested Citation

  • Aufenanger, Tobias, 2017. "Machine learning to improve experimental design," FAU Discussion Papers in Economics 16/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics, revised 2017.
  • Handle: RePEc:zbw:iwqwdp:162017

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/169116/1/898624746.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Jinyong Hahn & Keisuke Hirano & Dean Karlan, 2011. "Adaptive Experimental Design Using the Propensity Score," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 29(1), pages 96-108, January.
    2. Bryan S. Graham, 2008. "Identifying Social Interactions Through Conditional Variance Restrictions," Econometrica, Econometric Society, vol. 76(3), pages 643-660, May.
    3. Alan B. Krueger, 1999. "Experimental Estimates of Education Production Functions," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 114(2), pages 497-532.
    4. Kasy, Maximilian, 2016. "Why Experimenters Might Not Always Want to Randomize, and What They Could Do Instead," Political Analysis, Cambridge University Press, vol. 24(3), pages 324-338, July.
    5. Nava Ashraf & Dean Karlan & Wesley Yin, 2006. "Tying Odysseus to the Mast: Evidence From a Commitment Savings Product in the Philippines," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 121(2), pages 635-672.
    6. Miriam Bruhn & David McKenzie, 2009. "In Pursuit of Balance: Randomization in Practice in Development Field Experiments," American Economic Journal: Applied Economics, American Economic Association, vol. 1(4), pages 200-232, October.
    7. Aufenanger, Tobias, 2018. "Treatment allocation for linear models," FAU Discussion Papers in Economics 14/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics, revised 2018.
    8. Moore, Ryan T., 2012. "Multivariate Continuous Blocking to Improve Political Science Experiments," Political Analysis, Cambridge University Press, vol. 20(4), pages 460-479.
    9. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881.

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. David Mayer-Foulkes, 2018. "Efficient Urbanization for Mexican Development," International Journal of Economics and Finance, Canadian Center of Science and Education, vol. 10(10), pages 1-1, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pedro Carneiro & Sokbae Lee & Daniel Wilhelm, 2020. "Optimal data collection for randomized control trials [Microcredit impacts: Evidence from a randomized microcredit program placement experiment by Compartamos Banco]," The Econometrics Journal, Royal Economic Society, vol. 23(1), pages 1-31.
    2. Yusuke Narita, 2018. "Experiment-as-Market: Incorporating Welfare into Randomized Controlled Trials," Cowles Foundation Discussion Papers 2127r, Cowles Foundation for Research in Economics, Yale University, revised May 2019.
    3. Pedro Carneiro & Sokbae (Simon) Lee & Daniel Wilhelm, 2017. "Optimal data collection for randomized control trials," CeMMAP working papers 45/17, Institute for Fiscal Studies.
    4. Timothy B. Armstrong & Shu Shen, 2013. "Inference on Optimal Treatment Assignments," Cowles Foundation Discussion Papers 1927RR, Cowles Foundation for Research in Economics, Yale University, revised Apr 2015.
    5. Yusuke Narita, 2018. "Toward an Ethical Experiment," Cowles Foundation Discussion Papers 2127, Cowles Foundation for Research in Economics, Yale University.
    6. Sven Resnjanskij & Jens Ruhose & Simon Wiederhold & Ludger Woessmann & Katharina Wedel, 2024. "Can Mentoring Alleviate Family Disadvantage in Adolescence? A Field Experiment to Improve Labor Market Prospects," Journal of Political Economy, University of Chicago Press, vol. 132(3), pages 1013-1062.
    7. Aufenanger, Tobias, 2018. "Treatment allocation for linear models," FAU Discussion Papers in Economics 14/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics, revised 2018.
    8. Eszter Czibor & David Jimenez‐Gomez & John A. List, 2019. "The Dozen Things Experimental Economists Should Do (More of)," Southern Economic Journal, John Wiley & Sons, vol. 86(2), pages 371-432, October.
    9. Pedro Carneiro & Sokbae (Simon) Lee & Daniel Wilhelm, 2016. "Optimal data collection for randomized control trials," CeMMAP working papers 15/16, Institute for Fiscal Studies.
    10. Max Tabord-Meehan, 2018. "Stratification Trees for Adaptive Randomization in Randomized Controlled Trials," Papers 1806.05127, arXiv.org, revised Jul 2022.
    11. Yong Cai & Ahnaf Rafi, 2022. "On the Performance of the Neyman Allocation with Small Pilots," Papers 2206.04643, arXiv.org, revised Mar 2024.
    12. Max Cytrynbaum, 2021. "Optimal Stratification of Survey Experiments," Papers 2111.08157, arXiv.org, revised Aug 2023.
    13. Yichong Zhang & Xin Zheng, 2020. "Quantile treatment effects and bootstrap inference under covariate‐adaptive randomization," Quantitative Economics, Econometric Society, vol. 11(3), pages 957-982, July.
    14. Pedro Carneiro & Sokbae (Simon) Lee & Daniel Wilhelm, 2017. "Optimal data collection for randomized control trials," CeMMAP working papers 15/17, Institute for Fiscal Studies.
    15. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    16. Clément de Chaisemartin & Jaime Ramirez-Cuellar, 2024. "At What Level Should One Cluster Standard Errors in Paired and Small-Strata Experiments?," American Economic Journal: Applied Economics, American Economic Association, vol. 16(1), pages 193-212, January.
    17. Suresh de Mel & David McKenzie & Christopher Woodruff, 2019. "Labor Drops: Experimental Evidence on the Return to Additional Labor in Microenterprises," American Economic Journal: Applied Economics, American Economic Association, vol. 11(1), pages 202-235, January.
    18. Beaman, Lori & Karlan, Dean S. & Thuysbaert, Bram, 2014. "Saving for a (not so) Rainy Day: A Randomized Evaluation of Savings Groups in Mali," Center Discussion Papers 187189, Yale University, Economic Growth Center.
    19. Patrizia Lattarulo & Marco Mariani & Laura Razzolini, 2017. "Nudging museums attendance: a field experiment with high school teens," Journal of Cultural Economics, Springer;The Association for Cultural Economics International, vol. 41(3), pages 259-277, August.
    20. Raj Chetty & John N. Friedman & Nathaniel Hilger & Emmanuel Saez & Diane Whitmore Schanzenbach & Danny Yagan, 2011. "How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project Star," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 126(4), pages 1593-1660.

    More about this item

    Keywords

experiment design; treatment allocation

    JEL classification:

    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C90 - Mathematical and Quantitative Methods - - Design of Experiments - - - General

    NEP fields

    This paper has been announced in the following NEP Reports:


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:zbw:iwqwdp:162017. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: ZBW - Leibniz Information Centre for Economics (email available below). General contact details of provider: https://edirc.repec.org/data/vierlde.html.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.