IDEAS home Printed from https://ideas.repec.org/p/iza/izadps/dp9908.html
   My bibliography  Save this paper

Optimal Data Collection for Randomized Control Trials

Author

Listed:
  • Carneiro, Pedro

    (University College London)

  • Lee, Sokbae

    (Institute for Fiscal Studies, London)

  • Wilhelm, Daniel

    (University College London)

Abstract

In a randomized control trial, the precision of an average treatment effect estimator can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. We propose the use of pre-experimental data such as a census, or a household survey, to inform the choice of both the sample size and the covariates to be collected. Our procedure seeks to minimize the resulting average treatment effect estimator's mean squared error, subject to the researcher's budget constraint. We rely on a modification of an orthogonal greedy algorithm that is conceptually simple and easy to implement in the presence of a large number of potential covariates, and does not require any tuning parameters. In two empirical applications, we show that our procedure can lead to substantial gains of up to 58%, measured either in terms of reductions in data collection costs or in terms of improvements in the precision of the treatment effect estimator.

Suggested Citation

  • Carneiro, Pedro & Lee, Sokbae & Wilhelm, Daniel, 2016. "Optimal Data Collection for Randomized Control Trials," IZA Discussion Papers 9908, Institute of Labor Economics (IZA).
  • Handle: RePEc:iza:izadps:dp9908
    as

    Download full text from publisher

    File URL: https://docs.iza.org/dp9908.pdf
    Download Restriction: no
    ---><---

    Other versions of this item:

    References listed on IDEAS

    as
    1. Jinyong Hahn & Keisuke Hirano & Dean Karlan, 2011. "Adaptive Experimental Design Using the Propensity Score," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 29(1), pages 96-108, January.
    2. Abhijit Banerjee & Sylvain Chassang & Sergio Montero & Erik Snowberg, 2017. "A Theory of Experimenters," NBER Working Papers 23867, National Bureau of Economic Research, Inc.
    3. Jon Kleinberg & Jens Ludwig & Sendhil Mullainathan & Ziad Obermeyer, 2015. "Prediction Policy Problems," American Economic Review, American Economic Association, vol. 105(5), pages 491-495, May.
    4. Alessandro Tarozzi & Jaikishan Desai & Kristin Johnson, 2015. "The Impacts of Microcredit: Evidence from Ethiopia," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 54-89, January.
    5. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    6. Kari Lock Morgan & Donald B. Rubin, 2015. "Rerandomization to Balance Tiers of Covariates," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1412-1421, December.
    7. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "High-Dimensional Methods and Inference on Structural and Treatment Effects," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 29-50, Spring.
    8. Michael Kremer & Edward Miguel & Rebecca Thornton, 2009. "Incentives to Learn," The Review of Economics and Statistics, MIT Press, vol. 91(3), pages 437-456, August.
    9. John List & Sally Sadoff & Mathis Wagner, 2011. "So you want to run an experiment, now what? Some simple rules of thumb for optimal experimental design," Experimental Economics, Springer;Economic Science Association, vol. 14(4), pages 439-457, November.
    10. Duflo, Esther & Dupas, Pascaline & Kremer, Michael, 2015. "School governance, teacher incentives, and pupil–teacher ratios: Experimental evidence from Kenyan primary schools," Journal of Public Economics, Elsevier, vol. 123(C), pages 92-110.
    11. Britta Augsburg & Ralph De Haas & Heike Harmgart & Costas Meghir, 2015. "The Impacts of Microcredit: Evidence from Bosnia and Herzegovina," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 183-203, January.
    12. Paul Glewwe & Nauman Ilias & Michael Kremer, 2010. "Teacher Incentives," American Economic Journal: Applied Economics, American Economic Association, vol. 2(3), pages 205-227, July.
    13. Kasy, Maximilian, 2016. "Why Experimenters Might Not Always Want to Randomize, and What They Could Do Instead," Political Analysis, Cambridge University Press, vol. 24(3), pages 324-338, July.
    14. Brendon McConnell & Marcos Vera-Hernandez, 2015. "Going beyond simple sample size calculations: a practitioner's guide," IFS Working Papers W15/17, Institute for Fiscal Studies.
    15. Oriana Bandiera & Iwan Barankay & Imran Rasul, 2011. "Field Experiments with Firms," Journal of Economic Perspectives, American Economic Association, vol. 25(3), pages 63-82, Summer.
    16. Manuela Angelucci & Dean Karlan & Jonathan Zinman, 2015. "Microcredit Impacts: Evidence from a Randomized Microcredit Program Placement Experiment by Compartamos Banco," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 151-182, January.
    17. Amy Finkelstein & Sarah Taubman & Bill Wright & Mira Bernstein & Jonathan Gruber & Joseph P. Newhouse & Heidi Allen & Katherine Baicker, 2012. "The Oregon Health Insurance Experiment: Evidence from the First Year," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 127(3), pages 1057-1106.
    18. Esther Duflo & Pascaline Dupas & Michael Kremer, 2011. "Peer Effects, Teacher Incentives, and the Impact of Tracking: Evidence from a Randomized Evaluation in Kenya," American Economic Review, American Economic Association, vol. 101(5), pages 1739-1774, August.
    19. Benjamin A. Olken, 2015. "Promises and Perils of Pre-analysis Plans," Journal of Economic Perspectives, American Economic Association, vol. 29(3), pages 61-80, Summer.
    20. Sylvie Moulin & Michael Kremer & Paul Glewwe, 2009. "Many Children Left Behind? Textbooks and Test Scores in Kenya," American Economic Journal: Applied Economics, American Economic Association, vol. 1(1), pages 112-135, January.
    21. Bruno Crépon & Florencia Devoto & Esther Duflo & William Parienté, 2015. "Estimating the Impact of Microcredit on Those Who Take It Up: Evidence from a Randomized Experiment in Morocco," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 123-150, January.
    22. Duflo, Esther & Glennerster, Rachel & Kremer, Michael, 2008. "Using Randomization in Development Economics Research: A Toolkit," Handbook of Development Economics, in: T. Paul Schultz & John A. Strauss (ed.), Handbook of Development Economics, edition 1, volume 4, chapter 61, pages 3895-3962, Elsevier.
    23. List, John A. & Rasul, Imran, 2011. "Field Experiments in Labor Economics," Handbook of Labor Economics, in: O. Ashenfelter & D. Card (ed.), Handbook of Labor Economics, edition 1, volume 4, chapter 2, pages 103-228, Elsevier.
    24. Aleksey Tetenov, 2016. "An economic theory of statistical testing," CeMMAP working papers CWP50/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    25. Daniel S. Hamermesh, 2013. "Six Decades of Top Economics Publishing: Who and How?," Journal of Economic Literature, American Economic Association, vol. 51(1), pages 162-172, March.
    26. Abhijit V. Banerjee & Esther Duflo, 2009. "The Experimental Approach to Development Economics," Annual Review of Economics, Annual Reviews, vol. 1(1), pages 151-178, May.
    27. Orazio Attanasio & Britta Augsburg & Ralph De Haas & Emla Fitzsimons & Heike Harmgart, 2015. "The Impacts of Microfinance: Evidence from Joint-Liability Lending in Mongolia," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 90-122, January.
    28. Miriam Bruhn & David McKenzie, 2009. "In Pursuit of Balance: Randomization in Practice in Development Field Experiments," American Economic Journal: Applied Economics, American Economic Association, vol. 1(4), pages 200-232, October.
    29. Meghir, Costas & Mommaerts, Corina & Carneiro, Pedro & Koussihouede, Oswald & Lahire, Nathalie, 2015. "Decentralizing Education Resources: School Grants in Senegal," Center Discussion Papers 201691, Yale University, Economic Growth Center.
    30. John A. List, 2011. "Why Economists Should Conduct Field Experiments and 14 Tips for Pulling One Off," Journal of Economic Perspectives, American Economic Association, vol. 25(3), pages 3-16, Summer.
    31. Bhattacharya, Debopam & Dupas, Pascaline, 2012. "Inferring welfare maximizing treatment assignment under budget constraints," Journal of Econometrics, Elsevier, vol. 167(1), pages 168-196.
    32. Edward Miguel & Michael Kremer, 2004. "Worms: Identifying Impacts on Education and Health in the Presence of Treatment Externalities," Econometrica, Econometric Society, vol. 72(1), pages 159-217, January.
    33. Abhijit Banerjee & Esther Duflo & Rachel Glennerster & Cynthia Kinnan, 2015. "The Miracle of Microfinance? Evidence from a Randomized Evaluation," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 22-53, January.
    34. McKenzie, David, 2012. "Beyond baseline and follow-up: The case for more T in experiments," Journal of Development Economics, Elsevier, vol. 99(2), pages 210-221.
    35. Abhijit Banerjee & Dean Karlan & Jonathan Zinman, 2015. "Six Randomized Evaluations of Microcredit: Introduction and Further Steps," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 1-21, January.
    36. Lucas C. Coffman & Muriel Niederle, 2015. "Pre-analysis Plans Have Limited Upside, Especially Where Replications Are Feasible," Journal of Economic Perspectives, American Economic Association, vol. 29(3), pages 81-98, Summer.
    37. repec:feb:artefa:0110 is not listed on IDEAS
    38. Jeff Dominitz & Charles F. Manski, 2017. "More Data or Better Data? A Statistical Decision Problem," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 84(4), pages 1583-1605.
    39. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. Karthik Muralidharan & Mauricio Romero & Kaspar Wüthrich, 2019. "Factorial Designs, Model Selection, and (Incorrect) Inference in Randomized Experiments," NBER Working Papers 26562, National Bureau of Economic Research, Inc.
    2. Eszter Czibor & David Jimenez‐Gomez & John A. List, 2019. "The Dozen Things Experimental Economists Should Do (More of)," Southern Economic Journal, John Wiley & Sons, vol. 86(2), pages 371-432, October.
    3. Max Tabord-Meehan, 2018. "Stratification Trees for Adaptive Randomization in Randomized Controlled Trials," Papers 1806.05127, arXiv.org, revised Jul 2022.
    4. John A. List & Ian Muir & Gregory K. Sun, 2022. "Using Machine Learning for Efficient Flexible Regression Adjustment in Economic Experiments," NBER Working Papers 30756, National Bureau of Economic Research, Inc.
    5. Prakash, Shivendra & Markfort, Corey D., 2022. "A Monte-Carlo based 3-D ballistics model for guiding bat carcass surveys using environmental and turbine operational data," Ecological Modelling, Elsevier, vol. 470(C).
    6. Pons Rotger, Gabriel & Rosholm, Michael, 2020. "The Role of Beliefs in Long Sickness Absence: Experimental Evidence from a Psychological Intervention," IZA Discussion Papers 13582, Institute of Labor Economics (IZA).
    7. Aufenanger, Tobias, 2018. "Treatment allocation for linear models," FAU Discussion Papers in Economics 14/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics, revised 2018.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pedro Carneiro & Sokbae (Simon) Lee & Daniel Wilhelm, 2017. "Optimal data collection for randomized control trials," CeMMAP working papers 45/17, Institute for Fiscal Studies.
    2. Pedro Carneiro & Sokbae (Simon) Lee & Daniel Wilhelm, 2016. "Optimal data collection for randomized control trials," CeMMAP working papers 15/16, Institute for Fiscal Studies.
    3. Pedro Carneiro & Sokbae (Simon) Lee & Daniel Wilhelm, 2017. "Optimal data collection for randomized control trials," CeMMAP working papers 15/17, Institute for Fiscal Studies.
    4. Eduard Marinov, 2019. "The 2019 Nobel Prize in Economics," Economic Thought journal, Bulgarian Academy of Sciences - Economic Research Institute, issue 6, pages 78-116.
    5. Dahal, Mahesh & Fiala, Nathan, 2020. "What do we know about the impact of microfinance? The problems of statistical power and precision," World Development, Elsevier, vol. 128(C).
    6. Eszter Czibor & David Jimenez‐Gomez & John A. List, 2019. "The Dozen Things Experimental Economists Should Do (More of)," Southern Economic Journal, John Wiley & Sons, vol. 86(2), pages 371-432, October.
    7. Victor Chernozhukov & Mert Demirer & Esther Duflo & Ivan Fernandez-Val, 2017. "Generic machine learning inference on heterogenous treatment effects in randomized experiments," CeMMAP working papers 61/17, Institute for Fiscal Studies.
    8. Committee, Nobel Prize, 2019. "Understanding development and poverty alleviation," Nobel Prize in Economics documents 2019-2, Nobel Prize Committee.
    9. Dahal, Mahesh & Fiala, Nathan, 2018. "What do we know about the impact of microfinance? The problems of power and precision," Ruhr Economic Papers 756, RWI - Leibniz-Institut für Wirtschaftsforschung, Ruhr-University Bochum, TU Dortmund University, University of Duisburg-Essen.
    10. Cai, Shu, 2020. "Migration under liquidity constraints: Evidence from randomized credit access in China," Journal of Development Economics, Elsevier, vol. 142(C).
    11. Lucia Dalla Pellegrina & Giorgio Di Maio & Paolo Landoni & Emanuele Rusinà, 2021. "Money management and entrepreneurial training in microfinance: impact on beneficiaries and institutions," Economia Politica: Journal of Analytical and Institutional Economics, Springer;Fondazione Edison, vol. 38(3), pages 1049-1085, October.
    12. Emily Breza & Cynthia Kinnan, 2021. "Measuring the Equilibrium Impacts of Credit: Evidence from the Indian Microfinance Crisis," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 136(3), pages 1447-1497.
    13. N'dri, Lasme Mathieu & Kakinaka, Makoto, 2020. "Financial inclusion, mobile money, and individual welfare: The case of Burkina Faso," Telecommunications Policy, Elsevier, vol. 44(3).
    14. Ahlin, Christian & Gulesci, Selim & Madestam, Andreas & Stryjan, Miri, 2020. "Loan contract structure and adverse selection: Survey evidence from Uganda," Journal of Economic Behavior & Organization, Elsevier, vol. 172(C), pages 180-195.
    15. Gyorgy Molnar & Attila Havas, 2019. "Escaping from the poverty trap with social innovation: a social microcredit programme in Hungary," CERS-IE WORKING PAPERS 1912, Institute of Economics, Centre for Economic and Regional Studies.
    16. Karlan, Dean & Osman, Adam & Zinman, Jonathan, 2016. "Follow the money not the cash: Comparing methods for identifying consumption and investment responses to a liquidity shock," Journal of Development Economics, Elsevier, vol. 121(C), pages 11-23.
    17. Nakano, Yuko & Magezi, Eustadius F., 2020. "The impact of microcredit on agricultural technology adoption and productivity: Evidence from randomized control trial in Tanzania," World Development, Elsevier, vol. 133(C).
    18. Tamara Broderick & Ryan Giordano & Rachael Meager, 2020. "An Automatic Finite-Sample Robustness Metric: When Can Dropping a Little Data Make a Big Difference?," Papers 2011.14999, arXiv.org, revised Jul 2023.
    19. Lota Tamini & Ibrahima Bocoum & Ghislain Auger & Kotchikpa Gabriel Lawin & Arahama Traoré, 2019. "Enhanced Microfinance Services and Agricultural Best Management Practices: What Benefits for Smallholders Farmers? An Evidence from Burkina Faso," CIRANO Working Papers 2019s-11, CIRANO.
    20. Susmita Baulia, 2017. "Take-up of joint and individual liability loans: an analysis with laboratory experiments," Discussion Papers 117, Aboa Centre for Economics.

    More about this item

    Keywords

    optimal survey design; data collection; big data; randomized control trials; orthogonal greedy algorithm; survey costs;
    All these keywords.

    JEL classification:

    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:iza:izadps:dp9908. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Holger Hinte (email available below). General contact details of provider: https://edirc.repec.org/data/izaaade.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.