IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2511.00727.html

Cross-Validated Causal Inference: a Modern Method to Combine Experimental and Observational Data

Author

Listed:
  • Xuelin Yang
  • Licong Lin
  • Susan Athey
  • Michael I. Jordan
  • Guido W. Imbens

Abstract

We develop new methods to integrate experimental and observational data in causal inference. While randomized controlled trials offer strong internal validity, they are often costly and therefore limited in sample size. Observational data, though cheaper and often with larger sample sizes, are prone to biases due to unmeasured confounders. To harness their complementary strengths, we propose a systematic framework that formulates causal estimation as an empirical risk minimization (ERM) problem. A full model containing the causal parameter is obtained by minimizing a weighted combination of experimental and observational losses--capturing the causal parameter's validity and the full model's fit, respectively. The weight is chosen through cross-validation on the causal parameter across experimental folds. Our experiments on real and synthetic data show the efficacy and reliability of our method. We also provide theoretical non-asymptotic error bounds.

Suggested Citation

  • Xuelin Yang & Licong Lin & Susan Athey & Michael I. Jordan & Guido W. Imbens, 2025. "Cross-Validated Causal Inference: a Modern Method to Combine Experimental and Observational Data," Papers 2511.00727, arXiv.org.
  • Handle: RePEc:arx:papers:2511.00727
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2511.00727
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Guildo W. Imbens, 2003. "Sensitivity to Exogeneity Assumptions in Program Evaluation," American Economic Review, American Economic Association, vol. 93(2), pages 126-132, May.
    2. A. Smith, Jeffrey & E. Todd, Petra, 2005. "Does matching overcome LaLonde's critique of nonexperimental estimators?," Journal of Econometrics, Elsevier, vol. 125(1-2), pages 305-353.
    3. van der Laan Mark J. & Rubin Daniel, 2006. "Targeted Maximum Likelihood Learning," The International Journal of Biostatistics, De Gruyter, vol. 2(1), pages 1-40, December.
    4. Evan T.R. Rosenman & Guillaume Basse & Art B. Owen & Mike Baiocchi, 2023. "Combining observational and experimental datasets using shrinkage estimators," Biometrics, The International Biometric Society, vol. 79(4), pages 2961-2973, December.
    5. James J. Heckman & V. Joseph Hotz & Marcelo Dabos, 1987. "Do We Need Experimental Data To Evaluate the Impact of Manpower Training On Earnings?," Evaluation Review, , vol. 11(4), pages 395-427, August.
    6. Jinyong Hahn, 1998. "On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects," Econometrica, Econometric Society, vol. 66(2), pages 315-332, March.
    7. Heejung Bang & James M. Robins, 2005. "Doubly Robust Estimation in Missing Data and Causal Inference Models," Biometrics, The International Biometric Society, vol. 61(4), pages 962-973, December.
    8. Xiong, Ruoxuan & Koenecke, Allison & Powell, Michael & Shen, Zhu & Vogelstein, Joshua T. & Athey, Susan, 2021. "Federated Causal Inference in Heterogeneous Observational Data," Research Papers 3990, Stanford University, Graduate School of Business.
    9. George Z. Gui, 2024. "Combining Observational and Experimental Data to Improve Efficiency Using Imperfect Instruments," Marketing Science, INFORMS, vol. 43(2), pages 378-391, March.
    10. Shu Yang & Peng Ding, 2020. "Combining Multiple Observational Data Sources to Estimate Causal Effects," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 115(531), pages 1540-1554, July.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Luo, Shanshan & Zhang, Yechi & Li, Wei & Geng, Zhi, 2025. "Multiply robust estimation of causal effects using linked data," Computational Statistics & Data Analysis, Elsevier, vol. 209(C).
    2. Iván Díaz & Elizabeth Colantuoni & Daniel F. Hanley & Michael Rosenblum, 2019. "Improved precision in the analysis of randomized trials with survival outcomes, without assuming proportional hazards," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(3), pages 439-468, July.
    3. Frölich, Markus & Huber, Martin & Wiesenfarth, Manuel, 2017. "The finite sample performance of semi- and non-parametric estimators for treatment effects and policy evaluation," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 91-102.
    4. Richard K. Crump & V. Joseph Hotz & Guido W. Imbens & Oscar A. Mitnik, 2006. "Moving the Goalposts: Addressing Limited Overlap in the Estimation of Average Treatment Effects by Changing the Estimand," NBER Technical Working Papers 0330, National Bureau of Economic Research, Inc.
    5. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    6. Sant’Anna, Pedro H.C. & Zhao, Jun, 2020. "Doubly robust difference-in-differences estimators," Journal of Econometrics, Elsevier, vol. 219(1), pages 101-122.
    7. Zhao, Zhong, 2008. "Sensitivity of propensity score methods to the specifications," Economics Letters, Elsevier, vol. 98(3), pages 309-319, March.
    8. Jianxuan Liu & Yanyuan Ma & Lan Wang, 2018. "An alternative robust estimator of average treatment effect in causal inference," Biometrics, The International Biometric Society, vol. 74(3), pages 910-923, September.
    9. Huber, Martin & Lechner, Michael & Wunsch, Conny, 2013. "The performance of estimators based on the propensity score," Journal of Econometrics, Elsevier, vol. 175(1), pages 1-21.
    10. Huber, Martin & Lechner, Michael & Wunsch, Conny, 2010. "How to Control for Many Covariates? Reliable Estimators Based on the Propensity Score," IZA Discussion Papers 5268, Institute of Labor Economics (IZA).
    11. Marco Caliendo & Sabine Kopeinig, 2008. "Some Practical Guidance For The Implementation Of Propensity Score Matching," Journal of Economic Surveys, Wiley Blackwell, vol. 22(1), pages 31-72, February.
    12. Dettmann, E. & Becker, C. & Schmeißer, C., 2011. "Distance functions for matching in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 55(5), pages 1942-1960, May.
    13. Hairu Wang & Yukun Liu & Haiying Zhou, 2025. "Score test for unconfoundedness under a logistic treatment assignment model," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 77(4), pages 517-533, August.
    14. Waverly Wei & Maya Petersen & Mark J van der Laan & Zeyu Zheng & Chong Wu & Jingshen Wang, 2023. "Efficient targeted learning of heterogeneous treatment effects for multiple subgroups," Biometrics, The International Biometric Society, vol. 79(3), pages 1934-1946, September.
    15. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    16. Frölich, Markus & Michaelowa, Katharina, 2004. "Peer effects and textbooks in primary education: Evidence from francophone sub-Saharan Africa," HWWA Discussion Papers 311, Hamburg Institute of International Economics (HWWA).
    17. Juan Díaz & Miguel Jaramillo, 2006. "An Evaluation of the Peruvian "Youth Labor Training Program"-PROJOVEN," OVE Working Papers 1006, Inter-American Development Bank, Office of Evaluation and Oversight (OVE).
    18. Paul Frédéric Blanche & Anders Holt & Thomas Scheike, 2023. "On logistic regression with right censored data, with or without competing risks, and its use for estimating treatment effects," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(2), pages 441-482, April.
    19. Richard K. Crump & V. Joseph Hotz & Guido W. Imbens & Oscar A. Mitnik, 2009. "Dealing with limited overlap in estimation of average treatment effects," Biometrika, Biometrika Trust, vol. 96(1), pages 187-199.
    20. Bernhard Schmidpeter, 2015. "The Fatal Consequences of Grief," CDL Aging, Health, Labor working papers 2015-07, The Christian Doppler (CD) Laboratory Aging, Health, and the Labor Market, Johannes Kepler University Linz, Austria.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2511.00727. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.