Printed from https://ideas.repec.org/p/nbr/nberwo/26562.html

Factorial Designs, Model Selection, and (Incorrect) Inference in Randomized Experiments

Authors

  • Karthik Muralidharan
  • Mauricio Romero
  • Kaspar Wüthrich

Abstract

Factorial designs are widely used for studying multiple treatments in one experiment. While t-tests based on the “long” model (including main and interaction effects) provide valid inferences against “business-as-usual” counterfactuals, “short” model t-tests (that ignore interactions) yield higher power if the interactions are zero, but incorrect inferences otherwise. Out of 27 factorial experiments published in top-5 journals in 2007–2017, 19 use the short model. We reanalyze these experiments, and show that over half of their published results lose significance when interactions are included. We show that testing the interactions using the long model and presenting the short model if the interactions are not significantly different from zero leads to incorrect inference due to the implied data-dependent model selection. Based on recent econometric advances, we show that local power improvements over the long model are possible. However, if the main effects are of primary interest, leaving the interaction cells empty yields valid inferences and global power improvements. In addition, the sample size needed to detect interactions is substantially larger than that required to detect main effects, resulting in most experiments being under-powered to detect interactions. Thus, using factorial designs to explore whether interactions are meaningful can be problematic because interaction estimates are likely to considerably overestimate the magnitude of the true effect conditional on being significant.
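The abstract's central point can be seen in a small simulation. The sketch below (illustrative only; the effect sizes and sample size are hypothetical, not taken from the paper) generates a 2×2 factorial experiment with a nonzero interaction, then fits both the "short" model (main effects only) and the "long" model (main effects plus interaction). The short-model coefficient on each treatment converges to the main effect plus half the interaction, so it does not estimate the effect against the business-as-usual counterfactual; the long model does.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40_000
t1 = rng.integers(0, 2, n)  # treatment 1, randomized with probability 1/2
t2 = rng.integers(0, 2, n)  # treatment 2, randomized independently of t1

# Hypothetical true effects (not from the paper); interaction deliberately nonzero.
beta1, beta2, beta12 = 1.0, 0.5, 0.8
y = beta1 * t1 + beta2 * t2 + beta12 * t1 * t2 + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients for the design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b_short = ols(np.column_stack([ones, t1, t2]), y)            # ignores interaction
b_long = ols(np.column_stack([ones, t1, t2, t1 * t2]), y)    # includes interaction

# With P(t2 = 1) = 1/2, the short model attributes half the interaction
# to each main effect: plim of the t1 coefficient is beta1 + beta12/2 = 1.4.
print(b_short[1])  # ≈ 1.4, not the business-as-usual effect of t1
print(b_long[1])   # ≈ 1.0, the effect of t1 alone
```

The bias direction and magnitude depend on the assignment probabilities and the sign of the interaction, which is why the abstract emphasizes that short-model inference is only valid when the interactions are exactly zero.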

Suggested Citation

  • Karthik Muralidharan & Mauricio Romero & Kaspar Wüthrich, 2019. "Factorial Designs, Model Selection, and (Incorrect) Inference in Randomized Experiments," NBER Working Papers 26562, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:26562
    Note: DEV ED HE LS TWP

    Download full text from publisher

    File URL: http://www.nber.org/papers/w26562.pdf
    Download Restriction: no


    References listed on IDEAS

    1. John A. List & Azeem M. Shaikh & Yang Xu, 2019. "Multiple hypothesis testing in experimental economics," Experimental Economics, Springer;Economic Science Association, vol. 22(4), pages 773-793, December.
    2. Graham Elliott & Ulrich K. Müller & Mark W. Watson, 2015. "Nearly Optimal Tests When a Nuisance Parameter Is Present Under the Null Hypothesis," Econometrica, Econometric Society, vol. 83, pages 771-811, March.
    3. Abhijit V. Banerjee & Shawn Cole & Esther Duflo & Leigh Linden, 2007. "Remedying Education: Evidence from Two Randomized Experiments in India," The Quarterly Journal of Economics, Oxford University Press, vol. 122(3), pages 1235-1264.
    4. Timothy B. Armstrong & Michal Kolesár, 2018. "Optimal Inference in a Class of Regression Models," Econometrica, Econometric Society, vol. 86(2), pages 655-683, March.
    5. Uri Gneezy & Kenneth L. Leonard & John A. List, 2009. "Gender Differences in Competition: Evidence From a Matrilineal and a Patriarchal Society," Econometrica, Econometric Society, vol. 77(5), pages 1637-1664, September.
    6. John List & Sally Sadoff & Mathis Wagner, 2011. "So you want to run an experiment, now what? Some simple rules of thumb for optimal experimental design," Experimental Economics, Springer;Economic Science Association, vol. 14(4), pages 439-457, November.
    7. Duflo, Esther & Dupas, Pascaline & Kremer, Michael, 2015. "School governance, teacher incentives, and pupil–teacher ratios: Experimental evidence from Kenyan primary schools," Journal of Public Economics, Elsevier, vol. 123(C), pages 92-110.
    8. Vivi Alatas & Abhijit Banerjee & Rema Hanna & Benjamin A. Olken & Julia Tobias, 2012. "Targeting the Poor: Evidence from a Field Experiment in Indonesia," American Economic Review, American Economic Association, vol. 102(4), pages 1206-1240, June.
    9. Dean Karlan & John A. List, 2007. "Does Price Matter in Charitable Giving? Evidence from a Large-Scale Natural Field Experiment," American Economic Review, American Economic Association, vol. 97(5), pages 1774-1793, December.
    10. Marianne Bertrand & Dean Karlan & Sendhil Mullainathan & Eldar Shafir & Jonathan Zinman, 2010. "What's Advertising Content Worth? Evidence from a Consumer Credit Marketing Field Experiment," The Quarterly Journal of Economics, Oxford University Press, vol. 125(1), pages 263-306.
    11. Stefan Eriksson & Dan-Olof Rooth, 2014. "Do Employers Use Unemployment as a Sorting Criterion When Hiring? Evidence from a Field Experiment," American Economic Review, American Economic Association, vol. 104(3), pages 1014-1039, March.
    12. Leeb, Hannes & Pötscher, Benedikt M., 2005. "Model Selection And Inference: Facts And Fiction," Econometric Theory, Cambridge University Press, vol. 21(1), pages 21-59, February.
    13. Daniel O. Gilligan & Naureen Karachiwalla & Ibrahim Kasirye & Adrienne M. Lucas & Derek Neal, 2022. "Educator Incentives and Educational Triage in Rural Primary Schools," Journal of Human Resources, University of Wisconsin Press, vol. 57(1), pages 79-111.
    14. Supreet Kaur & Michael Kremer & Sendhil Mullainathan, 2015. "Self-Control at Work," Journal of Political Economy, University of Chicago Press, vol. 123(6), pages 1227-1277.
    15. Amanda Pallais & Emily Glassberg Sands, 2016. "Why the Referential Treatment? Evidence from Field Experiments on Referrals," Journal of Political Economy, University of Chicago Press, vol. 124(6), pages 1793-1828.
    16. Guido W. Imbens & Charles F. Manski, 2004. "Confidence Intervals for Partially Identified Parameters," Econometrica, Econometric Society, vol. 72(6), pages 1845-1857, November.
    17. Blair, Graeme & Cooper, Jasper & Coppock, Alexander & Humphreys, Macartan, 2019. "Declaring and Diagnosing Research Designs," American Political Science Review, Cambridge University Press, vol. 113(3), pages 838-859, August.
    18. Isaac Mbiti & Karthik Muralidharan & Mauricio Romero & Youdi Schipper & Constantine Manda & Rakesh Rajani, 2019. "Inputs, Incentives, and Complementarities in Education: Experimental Evidence from Tanzania," The Quarterly Journal of Economics, Oxford University Press, vol. 134(3), pages 1627-1673.
    19. Miriam Bruhn & David McKenzie, 2009. "In Pursuit of Balance: Randomization in Practice in Development Field Experiments," American Economic Journal: Applied Economics, American Economic Association, vol. 1(4), pages 200-232, October.
    20. Stefano Dellavigna & John A. List & Ulrike Malmendier & Gautam Rao, 2017. "Voting to Tell Others," Review of Economic Studies, Oxford University Press, vol. 84(1), pages 143-181.
    21. Leeb, Hannes & Pötscher, Benedikt M., 2008. "Can One Estimate The Unconditional Distribution Of Post-Model-Selection Estimators?," Econometric Theory, Cambridge University Press, vol. 24(2), pages 338-376, April.
    22. Henrik Jacobsen Kleven & Martin B. Knudsen & Claus Thustrup Kreiner & Søren Pedersen & Emmanuel Saez, 2011. "Unwilling or Unable to Cheat? Evidence From a Tax Audit Experiment in Denmark," Econometrica, Econometric Society, vol. 79(3), pages 651-692, May.
    23. Nava Ashraf & James Berry & Jesse M. Shapiro, 2010. "Can Higher Prices Stimulate Product Use? Evidence from a Field Experiment in Zambia," American Economic Review, American Economic Association, vol. 100(5), pages 2383-2413, December.
    24. Chad Kendall & Tommaso Nannicini & Francesco Trebbi, 2015. "How Do Voters Respond to Information? Evidence from a Randomized Campaign," American Economic Review, American Economic Association, vol. 105(1), pages 322-353, January.
    25. Jorg Stoye, 2009. "More on Confidence Intervals for Partially Identified Parameters," Econometrica, Econometric Society, vol. 77(4), pages 1299-1315, July.
    26. Jessica Cohen & Pascaline Dupas, 2010. "Free Distribution or Cost-Sharing? Evidence from a Randomized Malaria Prevention Experiment," The Quarterly Journal of Economics, Oxford University Press, vol. 125(1), pages 1-45.
    27. Benjamin A. Olken, 2007. "Monitoring Corruption: Evidence from a Field Experiment in Indonesia," Journal of Political Economy, University of Chicago Press, vol. 115, pages 200-249.
    28. Garret Christensen & Edward Miguel, 2018. "Transparency, Reproducibility, and the Credibility of Economics Research," Journal of Economic Literature, American Economic Association, vol. 56(3), pages 920-980, September.
    29. Dean S. Karlan & Jonathan Zinman, 2008. "Credit Elasticities in Less-Developed Economies: Implications for Microfinance," American Economic Review, American Economic Association, vol. 98(3), pages 1040-1068, June.
    30. Johannes Haushofer & Jeremy Shapiro, 2016. "The Short-term Impact of Unconditional Cash Transfers to the Poor: Experimental Evidence from Kenya," The Quarterly Journal of Economics, Oxford University Press, vol. 131(4), pages 1973-2042.
    31. Donald W. K. Andrews & Patrik Guggenberger, 2009. "Hybrid and Size-Corrected Subsampling Methods," Econometrica, Econometric Society, vol. 77(3), pages 721-762, May.

    Citations

    Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.

    Cited by:

    1. Lee, Sokbae & Salanié, Bernard, 2020. "Filtered and Unfiltered Treatment Effects with Targeting Instruments," CEPR Discussion Papers 15092, C.E.P.R. Discussion Papers.
    2. Clare Leaver & Owen Ozier & Pieter Serneels & Andrew Zeitlin, 2021. "Recruitment, Effort, and Retention Effects of Performance Contracts for Civil Servants: Experimental Evidence from Rwandan Primary Schools," American Economic Review, American Economic Association, vol. 111(7), pages 2213-2246, July.
    3. Andrés González Lira & Ahmed Mushfiq Mobarak, 2018. "Slippery Fish: Enforcing Regulation when Agents Learn and Adapt," Cowles Foundation Discussion Papers 2143R, Cowles Foundation for Research in Economics, Yale University, revised Mar 2021.
    4. Jorge Luis García & James J. Heckman, 2021. "Early childhood education and life‐cycle health," Health Economics, John Wiley & Sons, Ltd., vol. 30(S1), pages 119-141, November.
    5. Erika Deserranno & Philipp Kastrau & Gianmarco León-Ciliotta, 2021. "Promotions and productivity: The role of meritocracy and pay progression in the public sector," Economics Working Papers 1770, Department of Economics and Business, Universitat Pompeu Fabra.
    6. Fernando, A. Nilesh, 2021. "Seeking the treated: The impact of mobile extension on farmer information exchange in India," Journal of Development Economics, Elsevier, vol. 153(C).
    7. Laura Derksen & Jason. T Kerwin & Natalia Ordaz Reynoso & Olivier Sterck, 2021. "Appointments: A More Effective Commitment Device for Health Behaviors," CSAE Working Paper Series 2021-13, Centre for the Study of African Economies, University of Oxford.
    8. Erika Deserranno & Philipp Kastrau & Gianmarco León-Ciliotta, 2021. "Promotions and Productivity: The Role of Meritocracy and Pay Progression in the Public Sector," Working Papers 1239, Barcelona Graduate School of Economics.
    9. Seim, Brigitte & Jablonski, Ryan & Ahlbäck, Johan, 2020. "How information about foreign aid affects public spending decisions: Evidence from a field experiment in Malawi," Journal of Development Economics, Elsevier, vol. 146(C).
    10. Philipp Ketz & Adam McCloskey, 2021. "Short and Simple Confidence Intervals when the Directions of Some Effects are Known," Papers 2109.08222, arXiv.org.
    11. Timothy B. Armstrong & Michal Koles'ar & Soonwoo Kwon, 2020. "Bias-Aware Inference in Regularized Regression Models," Papers 2012.14823, arXiv.org.
    12. Davide Viviano & Kaspar Wuthrich & Paul Niehaus, 2021. "(When) should you adjust inferences for multiple hypothesis testing?," Papers 2104.13367, arXiv.org, revised Nov 2021.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Peters, Jörg & Langbein, Jörg & Roberts, Gareth, 2016. "Policy evaluation, randomized controlled trials, and external validity—A systematic review," Economics Letters, Elsevier, vol. 147(C), pages 51-54.
    2. McCloskey, Adam, 2017. "Bonferroni-based size-correction for nonstandard testing problems," Journal of Econometrics, Elsevier, vol. 200(1), pages 17-35.
    3. Jörg Peters & Jörg Langbein & Gareth Roberts, 2018. "Generalization in the Tropics – Development Policy, Randomized Controlled Trials, and External Validity," World Bank Research Observer, World Bank Group, vol. 33(1), pages 34-64.
    4. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    5. Eszter Czibor & David Jimenez‐Gomez & John A. List, 2019. "The Dozen Things Experimental Economists Should Do (More of)," Southern Economic Journal, John Wiley & Sons, vol. 86(2), pages 371-432, October.
    6. Levitt, Steven D. & List, John A., 2009. "Field experiments in economics: The past, the present, and the future," European Economic Review, Elsevier, vol. 53(1), pages 1-18, January.
    7. Meredith, Jennifer & Robinson, Jonathan & Walker, Sarah & Wydick, Bruce, 2013. "Keeping the doctor away: Experimental evidence on investment in preventative health products," Journal of Development Economics, Elsevier, vol. 105(C), pages 196-210.
    8. Jason T. Kerwin & Rebecca L. Thornton, 2021. "Making the Grade: The Sensitivity of Education Program Effectiveness to Input Choices and Outcome Measures," The Review of Economics and Statistics, MIT Press, vol. 103(2), pages 251-264, May.
    9. Abhijit V. Banerjee & Esther Duflo, 2009. "The Experimental Approach to Development Economics," Annual Review of Economics, Annual Reviews, vol. 1(1), pages 151-178, May.
    10. Karthik Muralidharan & Paul Niehaus, 2017. "Experimentation at Scale," Journal of Economic Perspectives, American Economic Association, vol. 31(4), pages 103-124, Fall.
    11. Abhijit V. Banerjee & Esther Duflo, 2010. "Giving Credit Where It Is Due," Journal of Economic Perspectives, American Economic Association, vol. 24(3), pages 61-80, Summer.
    12. Sadoff, Sally & Samek, Anya, 2019. "Can interventions affect commitment demand? A field experiment on food choice," Journal of Economic Behavior & Organization, Elsevier, vol. 158(C), pages 90-109.
    13. B Kelsey Jack, "undated". "Market Inefficiencies and the Adoption of Agricultural Technologies in Developing Countries," CID Working Papers 50, Center for International Development at Harvard University.
    14. Mo, Di & Bai, Yu & Shi, Yaojiang & Abbey, Cody & Zhang, Linxiu & Rozelle, Scott & Loyalka, Prashant, 2020. "Institutions, implementation, and program effectiveness: Evidence from a randomized evaluation of computer-assisted learning in rural China," Journal of Development Economics, Elsevier, vol. 146(C).
    15. Lant Pritchett & Salimah Samji & Jeffrey S. Hammer, 2012. "It's All about MeE: Using Structured Experiential Learning ('e') to Crawl the Design Space," WIDER Working Paper Series wp-2012-104, World Institute for Development Economic Research (UNU-WIDER).
    16. Günther Fink & Margaret McConnell & Sebastian Vollmer, 2014. "Testing for heterogeneous treatment effects in experimental data: false discovery risks and correction procedures," Journal of Development Effectiveness, Taylor & Francis Journals, vol. 6(1), pages 44-57, January.
    17. Raymond P. Guiteras & B. Kelsey Jack, 2014. "Incentives, Selection and Productivity in Labor Markets: Evidence from Rural Malawi," NBER Working Papers 19825, National Bureau of Economic Research, Inc.
    18. Michael Grimm & Anicet Munyehirwe & Jörg Peters & Maximiliane Sievert, 2017. "A First Step up the Energy Ladder? Low Cost Solar Kits and Household’s Welfare in Rural Rwanda," World Bank Economic Review, World Bank Group, vol. 31(3), pages 631-649.
    19. Cherchye, Laurens & Demuynck, Thomas & Rock, Bram De, 2019. "Bounding counterfactual demand with unobserved heterogeneity and endogenous expenditures," Journal of Econometrics, Elsevier, vol. 211(2), pages 483-506.
    20. Paulina Oliva & B. Kelsey Jack & Samuel Bell & Elizabeth Mettetal & Christopher Severen, 2020. "Technology Adoption under Uncertainty: Take-Up and Subsequent Investment in Zambia," The Review of Economics and Statistics, MIT Press, vol. 102(3), pages 617-632, July.

    More about this item

    JEL classification:

    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C90 - Mathematical and Quantitative Methods - - Design of Experiments - - - General
    • C93 - Mathematical and Quantitative Methods - - Design of Experiments - - - Field Experiments


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:nbr:nberwo:26562. See general information about how to correct material in RePEc.

For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: (email available below). General contact details of provider: https://edirc.repec.org/data/nberrus.html.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service hosted by the Research Division of the Federal Reserve Bank of St. Louis . RePEc uses bibliographic data supplied by the respective publishers.