Printed from https://ideas.repec.org/p/arx/papers/1603.09326.html

Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index

Author

Listed:
  • Susan Athey
  • Raj Chetty
  • Guido Imbens
  • Hyunseung Kang

Abstract

Estimating the long-term effects of treatments is of interest in many fields. A common challenge in estimating such treatment effects is that long-term outcomes are unobserved in the time frame needed to make policy decisions. One approach to overcome this missing data problem is to analyze treatment effects on an intermediate outcome, often called a statistical surrogate, if it satisfies the condition that treatment and outcome are independent conditional on the statistical surrogate. The validity of the surrogacy condition is often controversial. Here we exploit the fact that in modern datasets, researchers often observe a large number, possibly hundreds or thousands, of intermediate outcomes, thought to lie on or close to the causal chain between the treatment and the long-term outcome of interest. Even if none of the individual proxies satisfies the statistical surrogacy criterion by itself, using multiple proxies can be useful in causal inference. We focus primarily on a setting with two samples: an experimental sample containing data about the treatment indicator and the surrogates, and an observational sample containing information about the surrogates and the primary outcome. We state assumptions under which the average treatment effect can be identified and estimated with a high-dimensional vector of proxies that collectively satisfy the surrogacy assumption, derive the bias from violations of the surrogacy assumption, and show that even if the primary outcome is also observed in the experimental sample, there is still information to be gained from using surrogates.
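
The two-sample setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes a linear model for the conditional mean of the primary outcome given the surrogates (one simple way to build a surrogate index), simulated data in which the surrogacy condition holds by construction, and hypothetical names throughout (surrogate_index_ate, delta, gamma). The index is fitted on the observational sample and then compared across treatment arms in the experimental sample, where the primary outcome is never used.

```python
import numpy as np


def surrogate_index_ate(S_obs, Y_obs, S_exp, W_exp):
    """Difference in mean predicted outcome between treated and control units.

    Assumption: E[Y | S] is linear in the surrogates S. This is an
    illustrative simplification, not a requirement of the general approach.
    """
    # Step 1: fit Y ~ 1 + S by least squares on the observational sample.
    X_obs = np.column_stack([np.ones(len(S_obs)), S_obs])
    beta, *_ = np.linalg.lstsq(X_obs, Y_obs, rcond=None)

    # Step 2: predict the index for every unit in the experimental sample.
    X_exp = np.column_stack([np.ones(len(S_exp)), S_exp])
    index = X_exp @ beta

    # Step 3: compare the mean index across the two treatment arms.
    return index[W_exp == 1].mean() - index[W_exp == 0].mean()


# Simulated data (hypothetical): the treatment shifts three surrogates by
# delta, and the outcome depends on the treatment only through the surrogates.
rng = np.random.default_rng(0)
n = 20_000
delta = np.array([1.0, 0.5, -0.5])   # effect of treatment on each surrogate
gamma = np.array([2.0, 1.0, 1.0])    # effect of each surrogate on the outcome
true_ate = float(delta @ gamma)      # implied long-term treatment effect

# Experimental sample: treatment indicator and surrogates, no primary outcome.
W_exp = rng.integers(0, 2, size=n)
S_exp = W_exp[:, None] * delta + rng.normal(size=(n, 3))

# Observational sample: surrogates and primary outcome, no treatment indicator.
S_obs = rng.normal(size=(n, 3))
Y_obs = S_obs @ gamma + rng.normal(size=n)

estimate = surrogate_index_ate(S_obs, Y_obs, S_exp, W_exp)
print(f"true effect: {true_ate:.2f}, surrogate-index estimate: {estimate:.2f}")
```

In this simulation the estimate should land close to the true effect because the outcome depends on the treatment only through the three surrogates. If the treatment also affected the outcome directly, the same procedure would be biased, which is the kind of violation the abstract refers to.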

Suggested Citation

  • Susan Athey & Raj Chetty & Guido Imbens & Hyunseung Kang, 2016. "Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index," Papers 1603.09326, arXiv.org, revised Apr 2024.
  • Handle: RePEc:arx:papers:1603.09326

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1603.09326
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Raj Chetty & John N. Friedman & Nathaniel Hilger & Emmanuel Saez & Diane Whitmore Schanzenbach & Danny Yagan, 2011. "How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project Star," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 126(4), pages 1593-1660.
    2. Alberto Abadie & Guido W. Imbens, 2016. "Matching on the Estimated Propensity Score," Econometrica, Econometric Society, vol. 84, pages 781-807, March.
    3. Bryan S. Graham & Cristine Campos De Xavier Pinto & Daniel Egel, 2012. "Inverse Probability Tilting for Moment Condition Models with Missing Data," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 79(3), pages 1053-1079.
    4. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    5. Keisuke Hirano & Guido W. Imbens & Geert Ridder, 2003. "Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score," Econometrica, Econometric Society, vol. 71(4), pages 1161-1189, July.
    6. Bryan S. Graham & Cristine Campos de Xavier Pinto & Daniel Egel, 2016. "Efficient Estimation of Data Combination Models by the Method of Auxiliary-to-Study Tilting (AST)," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(2), pages 288-301, April.
    7. Mark van der Laan & Maya Petersen, 2004. "Estimation of Direct and Indirect Causal Effects in Longitudinal Studies," U.C. Berkeley Division of Biostatistics Working Paper Series 1155, Berkeley Electronic Press.
    8. Ben B. Hansen, 2008. "The prognostic analogue of the propensity score," Biometrika, Biometrika Trust, vol. 95(2), pages 481-488.
    9. Susan Athey & Scott Stern, 2002. "The Impact of Information Technology on Emergency Health Care Outcomes," RAND Journal of Economics, The RAND Corporation, vol. 33(3), pages 399-432, Autumn.
    10. Ridder, Geert & Moffitt, Robert, 2007. "The Econometrics of Data Combination," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 75, Elsevier.
    11. Rubin, Donald B, 1986. "Statistical Matching Using File Concatenation with Adjusted Weights and Multiple Imputations," Journal of Business & Economic Statistics, American Statistical Association, vol. 4(1), pages 87-94, January.
    12. C. B. Begg & D. H. Y. Leung, 2000. "On the use of surrogate end points in randomized trials," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 163(1), pages 15-28.
    13. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881.
    14. Alberto Abadie & Guido W. Imbens, 2006. "Large Sample Properties of Matching Estimators for Average Treatment Effects," Econometrica, Econometric Society, vol. 74(1), pages 235-267, January.

    Citations


    Cited by:

    1. Guido W. Imbens, 2020. "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics," Journal of Economic Literature, American Economic Association, vol. 58(4), pages 1129-1179, December.
    2. Jiafeng Chen & David M. Ritzwoller, 2021. "Semiparametric Estimation of Long-Term Treatment Effects," Papers 2107.14405, arXiv.org, revised Aug 2023.
    3. Keith Battocchi & Eleanor Dillon & Maggie Hei & Greg Lewis & Miruna Oprescu & Vasilis Syrgkanis, 2021. "Estimating the Long-Term Effects of Novel Treatments," Papers 2103.08390, arXiv.org, revised Feb 2022.
    4. John Mullahy, 2018. "Treatment Effects with Multiple Outcomes," NBER Working Papers 25307, National Bureau of Economic Research, Inc.
    5. Carlos Fernández-Loría & Foster Provost, 2022. "Causal Decision Making and Causal Effect Estimation Are Not the Same…and Why It Matters," INFORMS Journal on Data Science, INFORMS, vol. 1(1), pages 4-16, April.
    6. Rahul Singh, 2022. "Generalized Kernel Ridge Regression for Long Term Causal Inference: Treatment Effects, Dose Responses, and Counterfactual Distributions," Papers 2201.05139, arXiv.org.
    7. Isaac Meza & Rahul Singh, 2021. "Nested Nonparametric Instrumental Variable Regression: Long Term, Mediated, and Time Varying Treatment Effects," Papers 2112.14249, arXiv.org, revised Mar 2024.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
    2. Graham, Bryan S. & Pinto, Cristine Campos de Xavier, 2022. "Semiparametrically efficient estimation of the average linear regression function," Journal of Econometrics, Elsevier, vol. 226(1), pages 115-138.
    3. Arun Advani & Toru Kitagawa & Tymon Słoczyński, 2019. "Mostly harmless simulations? Using Monte Carlo studies for estimator selection," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 34(6), pages 893-910, September.
    4. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    5. Huber, Martin & Lechner, Michael & Wunsch, Conny, 2013. "The performance of estimators based on the propensity score," Journal of Econometrics, Elsevier, vol. 175(1), pages 1-21.
    6. Kitagawa, Toru & Muris, Chris, 2016. "Model averaging in semiparametric estimation of treatment effects," Journal of Econometrics, Elsevier, vol. 193(1), pages 271-289.
    7. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    8. Toru Kitagawa & Chris Muris, 2013. "Covariate selection and model averaging in semiparametric estimation of treatment effects," CeMMAP working papers 61/13, Institute for Fiscal Studies.
    9. Difang Huang & Jiti Gao & Tatsushi Oka, 2022. "Semiparametric Single-Index Estimation for Average Treatment Effects," Papers 2206.08503, arXiv.org, revised Apr 2024.
    10. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    11. Bryan S. Graham & Guido W. Imbens & Geert Ridder, 2020. "Identification and Efficiency Bounds for the Average Match Function Under Conditionally Exogenous Matching," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 38(2), pages 303-316, April.
    12. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    13. Susan Athey & Raj Chetty & Guido W. Imbens & Hyunseung Kang, 2019. "The Surrogate Index: Combining Short-Term Proxies to Estimate Long-Term Treatment Effects More Rapidly and Precisely," NBER Working Papers 26463, National Bureau of Economic Research, Inc.
    14. Shengfang Tang & Zongwu Cai & Ying Fang & Ming Lin, 2020. "A New Quantile Treatment Effect Model for Studying Smoking Effect on Birth Weight During Mother's Pregnancy," WORKING PAPERS SERIES IN THEORETICAL AND APPLIED ECONOMICS 202003, University of Kansas, Department of Economics, revised Feb 2020.
    15. Lee, Ying-Ying, 2018. "Efficient propensity score regression estimators of multivalued treatment effects for the treated," Journal of Econometrics, Elsevier, vol. 204(2), pages 207-222.
    16. Bryan S. Graham & Guido Imbens & Geert Ridder, 2016. "Identification and efficiency bounds for the average match function under conditionally exogenous matching," CeMMAP working papers 10/16, Institute for Fiscal Studies.
    17. Adusumilli, Karun & Otsu, Taisuke & Qiu, Chen, 2023. "Reweighted nonparametric likelihood inference for linear functionals," LSE Research Online Documents on Economics 120198, London School of Economics and Political Science, LSE Library.
    18. Nikolay Doudchenko & Guido W. Imbens, 2016. "Balancing, Regression, Difference-In-Differences and Synthetic Control Methods: A Synthesis," NBER Working Papers 22791, National Bureau of Economic Research, Inc.
    19. Karun Adusumilli & Taisuke Otsu, 2018. "Likelihood ratio inference for missing data models," STICERD - Econometrics Paper Series 599, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
    20. Shu Yang & Yunshu Zhang, 2023. "Multiply robust matching estimators of average and quantile treatment effects," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 50(1), pages 235-265, March.

