Optimal data collection for randomized control trials

Authors

  • Pedro Carneiro
  • Sokbae Lee
  • Daniel Wilhelm

Abstract

In a randomized control trial, the precision of an average treatment effect estimator and the power of the corresponding t-test can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. To design the experiment, a researcher needs to solve this trade-off subject to her budget constraint. We show that this optimization problem is equivalent to optimally predicting outcomes by the covariates, which in turn can be solved using existing machine learning techniques using pre-experimental data such as other similar studies, a census, or a household survey. In two empirical applications, we show that our procedure can lead to reductions of up to 58% in the costs of data collection, or improvements of the same magnitude in the precision of the treatment effect estimator.
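
The trade-off described in the abstract can be illustrated with a small toy sketch. It is not the authors' procedure, which this record does not detail. It assumes a hypothetical cost structure (a fixed cost per individual plus a per-individual cost for each covariate collected) and uses residual outcome variance divided by sample size as a stand-in for the variance of the treatment effect estimator. The names `residual_variance` and `best_design`, the brute-force search over covariate subsets, and the simulated data are all illustrative assumptions.

```python
import random
from itertools import combinations


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def residual_variance(columns, y):
    """Mean squared residual of y after OLS on the columns plus an intercept.

    Uses Gram-Schmidt on centered columns, so no linear algebra library is needed.
    """
    n = len(y)
    mean_y = sum(y) / n
    resid = [v - mean_y for v in y]
    basis = []
    for col in columns:
        m = sum(col) / n
        q = [v - m for v in col]
        for b in basis:  # remove the part already explained by earlier columns
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        if dot(q, q) > 1e-12:  # skip (near-)collinear columns
            coef = dot(resid, q) / dot(q, q)
            resid = [ri - coef * qi for ri, qi in zip(resid, q)]
            basis.append(q)
    return dot(resid, resid) / n


def best_design(columns, y, covariate_costs, base_cost, budget):
    """Brute-force search: pick the covariate subset minimizing residual variance / n.

    Assumed cost model (hypothetical): each individual costs base_cost plus the
    cost of every covariate collected, and n is the largest affordable sample.
    Returns (objective, subset_indices, n), or None if nothing is affordable.
    """
    best = None
    p = len(columns)
    for k in range(p + 1):
        for subset in combinations(range(p), k):
            unit_cost = base_cost + sum(covariate_costs[j] for j in subset)
            n = int(budget // unit_cost)
            if n < 2:
                continue
            var = residual_variance([columns[j] for j in subset], y)
            objective = var / n
            if best is None or objective < best[0]:
                best = (objective, subset, n)
    return best


# Simulated stand-in for pre-experimental data (purely illustrative).
random.seed(0)
x0 = [random.gauss(0, 1) for _ in range(200)]  # covariate that predicts the outcome
x1 = [random.gauss(0, 1) for _ in range(200)]  # covariate unrelated to the outcome
y = [2 * a + random.gauss(0, 0.5) for a in x0]

objective, chosen, n = best_design(
    [x0, x1], y, covariate_costs=[1.0, 1.0], base_cost=1.0, budget=100.0
)
print("chosen covariates:", chosen, "sample size:", n)
```

In this toy setting, paying for the predictive covariate halves the affordable sample but removes most of the outcome variance, so it is worth collecting; if its cost is raised far enough, the search falls back to a larger sample with no covariates. Exhaustive enumeration is only feasible for a handful of candidate covariates.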

Suggested Citation

  • Pedro Carneiro & Sokbae Lee & Daniel Wilhelm, 2020. "Optimal data collection for randomized control trials," The Econometrics Journal, Royal Economic Society, vol. 23(1), pages 1-31.
  • Handle: RePEc:oup:emjrnl:v:23:y:2020:i:1:p:1-31.

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/ectj/utz020
    Download Restriction: Access to full text is restricted to subscribers.

    Citations


    Cited by:

    1. Karthik Muralidharan & Mauricio Romero & Kaspar Wüthrich, 2025. "Factorial Designs, Model Selection, and (Incorrect) Inference in Randomized Experiments," The Review of Economics and Statistics, MIT Press, vol. 107(3), pages 589-604, May.
    2. Eszter Czibor & David Jimenez‐Gomez & John A. List, 2019. "The Dozen Things Experimental Economists Should Do (More of)," Southern Economic Journal, John Wiley & Sons, vol. 86(2), pages 371-432, October.
    3. Max Tabord-Meehan, 2023. "Stratification Trees for Adaptive Randomisation in Randomised Controlled Trials," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 90(5), pages 2646-2673.
    4. John A. List & Ian Muir & Gregory Sun, 2024. "Using machine learning for efficient flexible regression adjustment in economic experiments," Econometric Reviews, Taylor & Francis Journals, vol. 44(1), pages 2-40, July.
    5. Prakash, Shivendra & Markfort, Corey D., 2022. "A Monte-Carlo based 3-D ballistics model for guiding bat carcass surveys using environmental and turbine operational data," Ecological Modelling, Elsevier, vol. 470(C).
    6. Pons Rotger, Gabriel & Rosholm, Michael, 2020. "The Role of Beliefs in Long Sickness Absence: Experimental Evidence from a Psychological Intervention," IZA Discussion Papers 13582, Institute of Labor Economics (IZA).
    7. Aufenanger, Tobias, 2018. "Treatment allocation for linear models," FAU Discussion Papers in Economics 14/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics, revised 2018.

    More about this item

    JEL classification:

    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
