IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2006.09676.html
   My bibliography  Save this paper

Combining Experimental and Observational Data to Estimate Treatment Effects on Long Term Outcomes

Author

Listed:
  • Susan Athey
  • Raj Chetty
  • Guido Imbens

Abstract

There has been an increase in interest in experimental evaluations to estimate causal effects, partly because their internal validity tends to be high. At the same time, as part of the big data revolution, large, detailed, and representative, administrative data sets have become more widely available. However, the credibility of estimates of causal effects based on such data sets alone can be low. In this paper, we develop statistical methods for systematically combining experimental and observational data to obtain credible estimates of the causal effect of a binary treatment on a primary outcome that we only observe in the observational sample. Both the observational and experimental samples contain data about a treatment, observable individual characteristics, and a secondary (often short term) outcome. To estimate the effect of a treatment on the primary outcome while addressing the potential confounding in the observational sample, we propose a method that makes use of estimates of the relationship between the treatment and the secondary outcome from the experimental sample. If assignment to the treatment in the observational sample were unconfounded, we would expect the treatment effects on the secondary outcome in the two samples to be similar. We interpret differences in the estimated causal effects on the secondary outcome between the two samples as evidence of unobserved confounders in the observational sample, and develop control function methods for using those differences to adjust the estimates of the treatment effects on the primary outcome. We illustrate these ideas by combining data on class size and third grade test scores from the Project STAR experiment with observational data on class size and both third and eighth grade test scores from the New York school system.

Suggested Citation

  • Susan Athey & Raj Chetty & Guido Imbens, 2020. "Combining Experimental and Observational Data to Estimate Treatment Effects on Long Term Outcomes," Papers 2006.09676, arXiv.org.
  • Handle: RePEc:arx:papers:2006.09676
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2006.09676
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Charles F. Manski, 2013. "Response to the Review of ‘Public Policy in an Uncertain World’," Economic Journal, Royal Economic Society, vol. 0, pages 412-415, August.
    2. Manski, Charles F, 1990. "Nonparametric Bounds on Treatment Effects," American Economic Review, American Economic Association, vol. 80(2), pages 319-323, May.
    3. Joshua D. Angrist & Jörn-Steffen Pischke, 2010. "The Credibility Revolution in Empirical Economics: How Better Research Design Is Taking the Con out of Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 24(2), pages 3-30, Spring.
    4. Angus Deaton, 2010. "Instruments, Randomization, and Learning about Development," Journal of Economic Literature, American Economic Association, vol. 48(2), pages 424-455, June.
    5. Keisuke Hirano & Jack R. Porter, 2009. "Asymptotics for Statistical Treatment Rules," Econometrica, Econometric Society, vol. 77(5), pages 1683-1701, September.
    6. Dehejia, Rajeev H., 2005. "Program evaluation as a decision problem," Journal of Econometrics, Elsevier, vol. 125(1-2), pages 141-173.
    7. David Card, 1990. "The Impact of the Mariel Boatlift on the Miami Labor Market," ILR Review, Cornell University, ILR School, vol. 43(2), pages 245-257, January.
    8. Guido W. Imbens & Whitney K. Newey, 2009. "Identification and Estimation of Triangular Simultaneous Equations Models Without Additivity," Econometrica, Econometric Society, vol. 77(5), pages 1481-1512, September.
    9. Rachel Glennerster & Kudzai Takavarasha, 2013. "Running Randomized Evaluations: A Practical Guide," Economics Books, Princeton University Press, edition 1, number 10085.
    10. Manski, Charles F., 2013. "Public Policy in an Uncertain World: Analysis and Decisions," Economics Books, Harvard University Press, number 9780674066892, Spring.
    11. Heckman, James, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.
    12. Alberto Abadie & Guido W. Imbens, 2016. "Matching on the Estimated Propensity Score," Econometrica, Econometric Society, vol. 84, pages 781-807, March.
    13. Charles F. Manski, 2004. "Statistical Treatment Rules for Heterogeneous Populations," Econometrica, Econometric Society, vol. 72(4), pages 1221-1246, July.
    14. Susan Athey & Guido W. Imbens, 2006. "Identification and Inference in Nonlinear Difference-in-Differences Models," Econometrica, Econometric Society, vol. 74(2), pages 431-497, March.
    15. Raj Chetty, 2009. "Sufficient Statistics for Welfare Analysis: A Bridge Between Structural and Reduced-Form Methods," Annual Review of Economics, Annual Reviews, vol. 1(1), pages 451-488, May.
    16. Jeffrey M. Wooldridge, 2015. "Control Function Methods in Applied Econometrics," Journal of Human Resources, University of Wisconsin Press, vol. 50(2), pages 420-445.
    17. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881.
    18. Joseph Hotz, V. & Imbens, Guido W. & Mortimer, Julie H., 2005. "Predicting the efficacy of future training programs using past experiences at other locations," Journal of Econometrics, Elsevier, vol. 125(1-2), pages 241-270.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. Jiafeng Chen & David M. Ritzwoller, 2021. "Semiparametric Estimation of Long-Term Treatment Effects," Papers 2107.14405, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guido W. Imbens, 2020. "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics," Journal of Economic Literature, American Economic Association, vol. 58(4), pages 1129-1179, December.
    2. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    3. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    4. Alex Eble & Peter Boone & Diana Elbourne, 2017. "On Minimizing the Risk of Bias in Randomized Controlled Trials in Economics," World Bank Economic Review, World Bank Group, vol. 31(3), pages 687-707.
    5. Jeffrey Smith & Arthur Sweetman, 2016. "Viewpoint: Estimating the causal effects of policies and programs," Canadian Journal of Economics, Canadian Economics Association, vol. 49(3), pages 871-905, August.
    6. Deaton, Angus & Cartwright, Nancy, 2018. "Understanding and misunderstanding randomized controlled trials," Social Science & Medicine, Elsevier, vol. 210(C), pages 2-21.
    7. Bryan S. Graham & Guido W. Imbens & Geert Ridder, 2014. "Complementarity and aggregate implications of assortative matching: A nonparametric analysis," Quantitative Economics, Econometric Society, vol. 5, pages 29-66, March.
    8. Bo, Hao & Galiani, Sebastian, 2021. "Assessing external validity," Research in Economics, Elsevier, vol. 75(3), pages 274-285.
    9. Martin Huber, 2019. "An introduction to flexible methods for policy evaluation," Papers 1910.00641, arXiv.org.
    10. Alberto Abadie & Susan Athey & Guido W. Imbens & Jeffrey M. Wooldridge, 2020. "Sampling‐Based versus Design‐Based Uncertainty in Regression Analysis," Econometrica, Econometric Society, vol. 88(1), pages 265-296, January.
    11. W. Bentley MacLeod, 2017. "Viewpoint: The human capital approach to inference," Canadian Journal of Economics, Canadian Economics Association, vol. 50(1), pages 5-39, February.
    12. Davide Viviano, 2019. "Policy Targeting under Network Interference," Papers 1906.10258, arXiv.org, revised Jun 2021.
    13. Alfredo Di Tillio & Marco Ottaviani & Peter Norman Sørensen, 2017. "Persuasion Bias in Science: Can Economics Help?," Economic Journal, Royal Economic Society, vol. 127(605), pages 266-304, October.
    14. Susan Athey & Stefan Wager, 2017. "Policy Learning with Observational Data," Papers 1702.02896, arXiv.org, revised Sep 2020.
    15. Susan Athey & Guido Imbens, 2016. "The Econometrics of Randomized Experiments," Papers 1607.00698, arXiv.org.
    16. Committee, Nobel Prize, 2021. "Answering causal questions using observational data," Nobel Prize in Economics documents 2021-2, Nobel Prize Committee.
    17. Rahul Singh & Liyuan Xu & Arthur Gretton, 2020. "Generalized Kernel Ridge Regression for Nonparametric Structural Functions and Semiparametric Treatment Effects," Papers 2010.04855, arXiv.org, revised Dec 2021.
    18. Juliano Assunção & Robert McMillan & Joshua Murphy & Eduardo Souza-Rodrigues, 2019. "Optimal Environmental Targeting in the Amazon Rainforest," NBER Working Papers 25636, National Bureau of Economic Research, Inc.
    19. van der Klaauw, Bas, 2014. "From micro data to causality: Forty years of empirical labor economics," Labour Economics, Elsevier, vol. 30(C), pages 88-97.
    20. Alberto Abadie & Susan Athey & Guido W. Imbens & Jeffrey M. Wooldridge, 2017. "Sampling-based vs. Design-based Uncertainty in Regression Analysis," Papers 1706.01778, arXiv.org, revised Jun 2019.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2006.09676. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: . General contact details of provider: http://arxiv.org/ .

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service hosted by the Research Division of the Federal Reserve Bank of St. Louis . RePEc uses bibliographic data supplied by the respective publishers.