
A primer on optimal transport for causal inference with observational data

Author

Listed:
  • Florian F Gunsilius

Abstract

The theory of optimal transportation has developed into a powerful and elegant framework for comparing probability distributions, with wide-ranging applications in all areas of science. The fundamental idea of analyzing probabilities by comparing their underlying state space naturally aligns with the core idea of causal inference, where understanding and quantifying counterfactual states is paramount. Despite this intuitive connection, explicit research at the intersection of optimal transport and causal inference is only beginning to develop. Yet, many foundational models in causal inference have implicitly relied on optimal transport principles for decades, without recognizing the underlying connection. Therefore, the goal of this review is to offer an introduction to the surprisingly deep existing connections between optimal transport and the identification of causal effects with observational data -- where optimal transport is not just a set of potential tools, but actually builds the foundation of model assumptions. As a result, this review is intended to unify the language and notation between different areas of statistics, mathematics, and econometrics, by pointing out these existing connections, and to explore novel problems and directions for future work in both areas derived from this realization.

Suggested Citation

  • Florian F Gunsilius, 2025. "A primer on optimal transport for causal inference with observational data," Papers 2503.07811, arXiv.org, revised Mar 2025.
  • Handle: RePEc:arx:papers:2503.07811

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2503.07811
    File Function: Latest version
    Download Restriction: no


