
Machine-Learning Tests for Effects on Multiple Outcomes

Author

Listed:
  • Jens Ludwig
  • Sendhil Mullainathan
  • Jann Spiess

Abstract

In this paper we present tools for applied researchers that re-purpose off-the-shelf methods from the computer-science field of machine learning to create a "discovery engine" for data from randomized controlled trials (RCTs). The applied problem we seek to solve is that economists invest vast resources into carrying out RCTs, including the collection of a rich set of candidate outcome measures. But given concerns about inference in the presence of multiple testing, economists usually wind up exploring just a small subset of the hypotheses that the available data could be used to test. This prevents us from extracting as much information as possible from each RCT, which in turn impairs our ability to develop new theories or strengthen the design of policy interventions. Our proposed solution combines the basic intuition of reverse regression, where the dependent variable of interest now becomes treatment assignment itself, with methods from machine learning that use the data themselves to flexibly identify whether there is any function of the outcomes that predicts (or has signal about) treatment group status. This leads to correctly-sized tests with appropriate $p$-values, which also have the important virtue of being easy to implement in practice. One open challenge that remains with our work is how to meaningfully interpret the signal that these methods find.
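The core idea in the abstract (predict treatment assignment from the outcomes, then ask whether the prediction beats chance) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the predictor is a simple nearest-centroid rule standing in for an off-the-shelf machine-learning method, the test statistic is out-of-fold accuracy, the p-value comes from a permutation of the treatment labels, and the data are simulated. The function names `out_of_fold_accuracy` and `permutation_p_value` are hypothetical.

```python
import random


def out_of_fold_accuracy(outcomes, treatment, n_folds=5):
    """Out-of-fold accuracy when predicting treatment status from outcomes.

    Stand-in learner (an assumption, not the paper's method): assign each
    held-out unit to the arm whose training-fold outcome centroid is closer.
    """
    n = len(treatment)
    k = len(outcomes[0])
    correct = 0
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        centroids = {}
        for arm in (0, 1):
            rows = [outcomes[i] for i in train if treatment[i] == arm]
            if not rows:
                raise ValueError("a training fold is missing one arm")
            centroids[arm] = [sum(r[j] for r in rows) / len(rows) for j in range(k)]
        for i in range(fold, n, n_folds):  # held-out units of this fold
            dist = {
                arm: sum((outcomes[i][j] - c[j]) ** 2 for j in range(k))
                for arm, c in centroids.items()
            }
            pred = 0 if dist[0] <= dist[1] else 1
            correct += int(pred == treatment[i])
    return correct / n


def permutation_p_value(outcomes, treatment, n_perm=200, seed=0):
    """Permutation p-value for 'outcomes carry no signal about treatment'.

    The whole prediction exercise is re-run on each shuffled label vector,
    so the reference distribution reflects the same fitting procedure.
    """
    rng = random.Random(seed)
    observed = out_of_fold_accuracy(outcomes, treatment)
    labels = list(treatment)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if out_of_fold_accuracy(outcomes, labels) >= observed:
            at_least_as_large += 1
    return (1 + at_least_as_large) / (1 + n_perm)


def simulate(effect, n=200, k=5, seed=1):
    """Simulated RCT: balanced random assignment, k outcomes shifted by `effect`."""
    rng = random.Random(seed)
    treatment = [0, 1] * (n // 2)
    rng.shuffle(treatment)
    outcomes = [[rng.gauss(0, 1) + effect * t for _ in range(k)] for t in treatment]
    return outcomes, treatment


# Simulated example with a treatment effect on every outcome, and one with none.
y_effect, d_effect = simulate(effect=1.0)
p_effect = permutation_p_value(y_effect, d_effect)
y_null, d_null = simulate(effect=0.0)
p_null = permutation_p_value(y_null, d_null)
print("p-value with effect:", p_effect)
print("p-value without effect:", p_null)
```

The single p-value answers one joint question (is there any detectable effect on the outcomes taken together?), which is why no multiple-testing correction across outcomes appears in the sketch. It does not say which outcomes drive the signal, in line with the open interpretation challenge the abstract mentions.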

Suggested Citation

  • Jens Ludwig & Sendhil Mullainathan & Jann Spiess, 2017. "Machine-Learning Tests for Effects on Multiple Outcomes," Papers 1707.01473, arXiv.org, revised May 2019.
  • Handle: RePEc:arx:papers:1707.01473

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1707.01473
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. John A. List & Azeem M. Shaikh & Yang Xu, 2019. "Multiple hypothesis testing in experimental economics," Experimental Economics, Springer;Economic Science Association, vol. 22(4), pages 773-793, December.
    2. Joseph P. Romano & Michael Wolf, 2005. "Stepwise Multiple Testing as Formalized Data Snooping," Econometrica, Econometric Society, vol. 73(4), pages 1237-1282, July.
    3. Jon Kleinberg & Jens Ludwig & Sendhil Mullainathan & Ziad Obermeyer, 2015. "Prediction Policy Problems," American Economic Review, American Economic Association, vol. 105(5), pages 491-495, May.
    4. Sendhil Mullainathan & Jann Spiess, 2017. "Machine Learning: An Applied Econometric Approach," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 87-106, Spring.
    5. Raj Chetty & Nathaniel Hendren & Lawrence F. Katz, 2016. "The Effects of Exposure to Better Neighborhoods on Children: New Evidence from the Moving to Opportunity Experiment," American Economic Review, American Economic Association, vol. 106(4), pages 855-902, April.
    6. Arthur S. Goldberger, 1984. "Reverse Regression and Salary Discrimination," Journal of Human Resources, University of Wisconsin Press, vol. 19(3), pages 293-318.
    7. Jeffrey R Kling & Jeffrey B Liebman & Lawrence F Katz, 2007. "Experimental Analysis of Neighborhood Effects," Econometrica, Econometric Society, vol. 75(1), pages 83-119, January.

    Citations


    Cited by:

    1. Ahsan Jansson, Cecilia & Patil, Vikram & Vecci, Joe & Chellattan Veettil, Prakashan & Yashodha, Yashodha, 2023. "Locus of Control and Economic Decision-Making: A Field Experiment in Odisha, India," Working Papers in Economics 833, University of Gothenburg, Department of Economics.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hermes, Henning & Lergetporer, Philipp & Peter, Frauke & Wiederhold, Simon, 2021. "Behavioral Barriers and the Socioeconomic Gap in Child Care Enrollment," Discussion Paper Series in Economics 16/2021, Norwegian School of Economics, Department of Economics.
    2. Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
    3. Hardt, David & Nagler, Markus & Rincke, Johannes, 2023. "Tutoring in (online) higher education: Experimental evidence," Economics of Education Review, Elsevier, vol. 92(C).
    4. McKenzie, David & Sansone, Dario, 2019. "Predicting entrepreneurial success is hard: Evidence from a business plan competition in Nigeria," Journal of Development Economics, Elsevier, vol. 141(C).
    5. Rute M. Caeiro & Pedro C. Vicente, 2020. "Knowledge of vitamin A deficiency and crop adoption: Evidence from a field experiment in Mozambique," Agricultural Economics, International Association of Agricultural Economists, vol. 51(2), pages 175-190, March.
    6. Hermes, Henning & Mierisch, Fabian & Peter, Frauke & Wiederhold, Simon & Lergetporer, Philipp, 2023. "Discrimination on the Child Care Market: A Nationwide Field Experiment," IZA Discussion Papers 16082, Institute of Labor Economics (IZA).
    7. Naguib, Costanza, 2019. "Estimating the Heterogeneous Impact of the Free Movement of Persons on Relative Wage Mobility," Economics Working Paper Series 1903, University of St. Gallen, School of Economics and Political Science.
    8. Grácio, Matilde & Vicente, Pedro C., 2021. "Information, get-out-the-vote messages, and peer influence: Causal effects on political behavior in Mozambique," Journal of Development Economics, Elsevier, vol. 151(C).
    9. Bryan S. Graham, 2018. "Identifying and Estimating Neighborhood Effects," Journal of Economic Literature, American Economic Association, vol. 56(2), pages 450-500, June.
    10. Billings, Stephen B. & Johnson, Erik B., 2012. "A non-parametric test for industrial specialization," Journal of Urban Economics, Elsevier, vol. 71(3), pages 312-331.
    11. Cygan-Rehm, Kamila & Karbownik, Krzysztof, 2022. "The effects of incentivizing early prenatal care on infant health," Journal of Health Economics, Elsevier, vol. 83(C).
    12. Fitzsimons, Emla & Malde, Bansi & Mesnard, Alice & Vera-Hernández, Marcos, 2016. "Nutrition, information and household behavior: Experimental evidence from Malawi," Journal of Development Economics, Elsevier, vol. 122(C), pages 113-126.
    13. Hu, Xiao & Liang, Che-Yuan, 2022. "Does income redistribution prevent residential segregation?," Journal of Economic Behavior & Organization, Elsevier, vol. 193(C), pages 519-542.
    14. Mohit Agrawal & Joseph G. Altonji & Richard K. Mansfield, 2019. "Quantifying Family, School, and Location Effects in the Presence of Complementarities and Sorting," Journal of Labor Economics, University of Chicago Press, vol. 37(S1), pages 11-83.
    15. Hermes, Henning & Krauß, Marina & Lergetporer, Philipp & Peter, Frauke & Wiederhold, Simon, 2022. "Early child care and labor supply of lower-SES mothers: A randomized controlled trial," DICE Discussion Papers 394, Heinrich Heine University Düsseldorf, Düsseldorf Institute for Competition Economics (DICE).
    16. Erik Heilmann & Janosch Henze & Heike Wetzel, 2021. "Machine learning in energy forecasts with an application to high frequency electricity consumption data," MAGKS Papers on Economics 202135, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    17. Edward L. Glaeser & Scott Duke Kominers & Michael Luca & Nikhil Naik, 2018. "Big Data And Big Cities: The Promises And Limitations Of Improved Measures Of Urban Life," Economic Inquiry, Western Economic Association International, vol. 56(1), pages 114-137, January.
    18. Morris A. Davis & Jesse Gregory & Daniel A. Hartley & Kegon T. K. Tan, 2021. "Neighborhood effects and housing vouchers," Quantitative Economics, Econometric Society, vol. 12(4), pages 1307-1346, November.
    19. Monica Langella & Alan Manning, 2019. "Diversity and Neighbourhood Satisfaction," The Economic Journal, Royal Economic Society, vol. 129(624), pages 3219-3255.
    20. Tranos, Emmanouil & Incera, Andre Carrascal & Willis, George, 2022. "Using the web to predict regional trade flows: data extraction, modelling, and validation," OSF Preprints 9bu5z, Center for Open Science.
