
Where to Experiment? Site Selection Under Distribution Shift via Optimal Transport and Wasserstein DRO

Author

  • Adam Bouyamourn

Abstract

How should researchers select experimental sites when the deployment population differs from observed data? I formulate the problem of experimental site selection as an optimal transport problem, developing methods to minimize downstream estimation error by choosing sites that minimize the Wasserstein distance between population and sample covariate distributions. I develop new theoretical upper bounds on the estimation errors of the population average treatment effect (PATE) and the conditional average treatment effect (CATE), and show that these different objectives lead to different site selection strategies. I extend this approach by using Wasserstein Distributionally Robust Optimization (DRO) to develop a site selection procedure that is robust to adversarial perturbations of covariate information, a specific model of distribution shift. I also propose a novel data-driven procedure for selecting the uncertainty radius of the Wasserstein DRO problem, which allows the user to benchmark robustness levels against observed variation in their data. Simulation evidence and a reanalysis of a randomized microcredit experiment in Morocco (Crépon et al.) show that these methods outperform random and stratified sampling of sites when covariates have prognostic R-squared > 0.5, and outperform alternative optimization methods (i) for moderate-to-large problem instances, (ii) when covariates are moderately informative about treatment effects, and (iii) under induced distribution shift.
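To make the core idea concrete, the following is a minimal sketch of choosing sites whose pooled covariates are closest to a target population in Wasserstein distance. It is an illustration only and not the paper's algorithm: it assumes a single scalar covariate, equal-weight empirical distributions, and exhaustive search over subsets, and the names `wasserstein_1d`, `select_sites`, `site_covariates`, and `population` are hypothetical.

```python
from itertools import combinations


def wasserstein_1d(xs, ys):
    """W1 distance between two equal-weight empirical distributions on the real line.

    Computed as the integral of |F_x(t) - F_y(t)| over t, where F_x and F_y
    are the empirical CDFs of the two samples.
    """
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    total = 0.0
    i = j = 0
    for left, right in zip(points, points[1:]):
        # Advance the counters so that i/len(xs) and j/len(ys) are the CDF values at `left`.
        while i < len(xs) and xs[i] <= left:
            i += 1
        while j < len(ys) and ys[j] <= left:
            j += 1
        total += abs(i / len(xs) - j / len(ys)) * (right - left)
    return total


def select_sites(site_covariates, population, k):
    """Pick the k sites whose pooled covariates minimize W1 to the population.

    Exhaustive search, so only feasible for small numbers of candidate sites
    (an assumption of this sketch, not a property of the paper's method).
    """
    best_subset, best_dist = None, float("inf")
    for subset in combinations(sorted(site_covariates), k):
        pooled = [x for site in subset for x in site_covariates[site]]
        dist = wasserstein_1d(pooled, population)
        if dist < best_dist:
            best_subset, best_dist = subset, dist
    return best_subset, best_dist


# Hypothetical toy data: four candidate sites, one scalar covariate each.
site_covariates = {
    "A": [0.0, 1.0],
    "B": [4.0, 5.0],
    "C": [9.0, 10.0],
    "D": [4.5, 5.5],
}
population = [0.0, 1.0, 4.0, 5.0, 9.0, 10.0]

chosen, distance = select_sites(site_covariates, population, k=3)
print(chosen, distance)
```

With multivariate covariates or many candidate sites, the distance would typically come from a general optimal transport solver and the subset search from an optimization routine rather than enumeration; the abstract does not specify either choice, so both are left open here.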

Suggested Citation

  • Adam Bouyamourn, 2025. "Where to Experiment? Site Selection Under Distribution Shift via Optimal Transport and Wasserstein DRO," Papers 2511.04658, arXiv.org.
  • Handle: RePEc:arx:papers:2511.04658

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2511.04658
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Abhijit Banerjee & Rukmini Banerji & James Berry & Esther Duflo & Harini Kannan & Shobhini Mukerji & Marc Shotland & Michael Walton, 2017. "From Proof of Concept to Scalable Policies: Challenges and Solutions, with an Application," Journal of Economic Perspectives, American Economic Association, vol. 31(4), pages 73-102, Fall.
    2. Arthur Charpentier & Emmanuel Flachaire & Ewen Gallic, 2023. "Optimal Transport for Counterfactual Estimation: A Method for Causal Inference," Papers 2301.07755, arXiv.org.
    3. Daniel Zhuoyu Long & Jin Qi & Aiqi Zhang, 2024. "Supermodularity in Two-Stage Distributionally Robust Optimization," Management Science, INFORMS, vol. 70(3), pages 1394-1409, March.
    4. Alfred Galichon, 2016. "Optimal Transport Methods in Economics," Economics Books, Princeton University Press, edition 1, number 10870, December.
    5. Abhijit V. Banerjee & Shawn Cole & Esther Duflo & Leigh Linden, 2007. "Remedying Education: Evidence from Two Randomized Experiments in India," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 122(3), pages 1235-1264.
    6. Alberto Abadie & Jinglong Zhao, 2021. "Synthetic Controls for Experimental Design," Papers 2108.02196, arXiv.org, revised Apr 2025.
    7. Alfred Galichon, 2016. "Optimal transport methods in economics," Sciences Po Economics Publications (main) hal-03256830, HAL.
    8. Bold, Tessa & Kimenyi, Mwangi & Mwabu, Germano & Ng’ang’a, Alice & Sandefur, Justin, 2018. "Experimental evidence on scaling up education reforms in Kenya," Journal of Public Economics, Elsevier, vol. 168(C), pages 1-20.
    9. Dimitris Bertsimas & Melvyn Sim, 2004. "The Price of Robustness," Operations Research, INFORMS, vol. 52(1), pages 35-53, February.
    10. Zhengyang Fan & Ran Ji & Miguel A. Lejeune, 2024. "Distributionally Robust Portfolio Optimization under Marginal and Copula Ambiguity," Journal of Optimization Theory and Applications, Springer, vol. 203(3), pages 2870-2907, December.
    11. Bruno Crépon & Florencia Devoto & Esther Duflo & William Parienté, 2015. "Estimating the Impact of Microcredit on Those Who Take It Up: Evidence from a Randomized Experiment in Morocco," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 123-150, January.
    12. Alfred Galichon, 2016. "Optimal transport methods in economics," SciencePo Working papers hal-03256830, HAL.
    13. Yueyao Li & Wenxun Xing, 2024. "Globalized distributionally robust optimization based on samples," Journal of Global Optimization, Springer, vol. 88(4), pages 871-900, April.
    14. Abadie, Alberto & Diamond, Alexis & Hainmueller, Jens, 2010. "Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program," Journal of the American Statistical Association, American Statistical Association, vol. 105(490), pages 493-505.
    15. Hunt Allcott, 2015. "Site Selection Bias in Program Evaluation," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 130(3), pages 1117-1165.
    16. Tyler J. VanderWeele & Ilya Shpitser, 2011. "A New Criterion for Confounder Selection," Biometrics, The International Biometric Society, vol. 67(4), pages 1406-1413, December.
    17. Egami, Naoki & Hartman, Erin, 2023. "Elements of External Validity: Framework, Design, and Analysis," American Political Science Review, Cambridge University Press, vol. 117(3), pages 1070-1088, August.
    18. William Torous & Florian Gunsilius & Philippe Rigollet, 2024. "An optimal transport approach to estimating causal effects via nonlinear difference-in-differences," Journal of Causal Inference, De Gruyter, vol. 12(1), pages 1-26.
    19. Yuchen Hu & Henry Zhu & Emma Brunskill & Stefan Wager, 2024. "Minimax-Regret Sample Selection in Randomized Experiments," Papers 2403.01386, arXiv.org, revised Jun 2024.
    20. Alberto Abadie & Javier Gardeazabal, 2003. "The Economic Costs of Conflict: A Case Study of the Basque Country," American Economic Review, American Economic Association, vol. 93(1), pages 113-132, March.
    21. Jose Blanchet & Karthyek Murthy, 2019. "Quantifying Distributional Model Risk via Optimal Transport," Mathematics of Operations Research, INFORMS, vol. 44(2), pages 565-600, May.
    22. William Torous & Florian Gunsilius & Philippe Rigollet, 2021. "An Optimal Transport Approach to Estimating Causal Effects via Nonlinear Difference-in-Differences," Papers 2108.05858, arXiv.org, revised Mar 2024.
    23. Alfred Galichon, 2016. "Optimal transport methods in economics," Post-Print hal-03256830, HAL.

    Citations


    Cited by:

    1. Alfred Galichon & Marc Henry, 2026. "An econometrician's guide to optimal transport," Papers 2604.04227, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xinran Liu, 2026. "Recovering Counterfactual Distributions via Wasserstein GANs," Papers 2601.17296, arXiv.org.
    2. Andrei Voronin, 2025. "Generalized Optimal Transport," Papers 2507.22422, arXiv.org.
    3. Alfred Galichon & Marc Henry, 2026. "An econometrician's guide to optimal transport," Papers 2604.04227, arXiv.org.
    4. Haiyan Liu & Bin Wang & Ruodu Wang & Sheng Chao Zhuang, 2023. "Distorted optimal transport," Papers 2308.11238, arXiv.org, revised May 2025.
    5. repec:osf:osfxxx:nwp8k_v1 is not listed on IDEAS
    6. Jakub Ryłow, 2026. "Topological Methods in Economics: From Equilibrium Existence to Topological Data Analysis," Working Papers 2026-9, Faculty of Economic Sciences, University of Warsaw.
    7. Susanne Schennach & Vincent Starck, 2026. "Optimally‐Transported Generalized Method of Moments," Econometrica, Econometric Society, vol. 94(2), pages 619-640, March.
    8. Mario Ghossoub & Jesse Hall & David Saunders, 2023. "Maximum Spectral Measures of Risk with Given Risk Factor Marginal Distributions," Mathematics of Operations Research, INFORMS, vol. 48(2), pages 1158-1182, May.
    9. Itai Arieli & Yakov Babichenko & Fedor Sandomirskiy, 2023. "Feasible Conditional Belief Distributions," Papers 2307.07672, arXiv.org, revised Nov 2024.
    10. Beatrice Acciaio & Berenice Anne Neumann, 2025. "Characterization of transport optimizers via graphs and applications to Stackelberg–Cournot–Nash equilibria," Mathematics and Financial Economics, Springer, volume 19, number 3, December.
    11. Masselus, Lise & Petrik, Christina & Ankel-Peters, Jörg, 2024. "Lost in the design space? Construct validity in the microfinance literature," Ruhr Economic Papers 1097, RWI - Leibniz-Institut für Wirtschaftsforschung, Ruhr-University Bochum, TU Dortmund University, University of Duisburg-Essen.
    12. Louis Chen & Will Ma & Karthik Natarajan & David Simchi-Levi & Zhenzhen Yan, 2022. "Distributionally Robust Linear and Discrete Optimization with Marginals," Operations Research, INFORMS, vol. 70(3), pages 1822-1834, May.
    13. Andrew Lyasoff, 2023. "The Time-Interlaced Self-Consistent Master System of Heterogeneous-Agent Models," Papers 2303.12567, arXiv.org, revised May 2025.
    14. Florian F Gunsilius, 2025. "A primer on optimal transport for causal inference with observational data," Papers 2503.07811, arXiv.org, revised Mar 2025.
    15. João Pedro M. Franco & Márcio Laurini, 2025. "Multivariate Risk Analysis in Cryptocurrency Market: An Optimal Transport Approach," Computational Economics, Springer;Society for Computational Economics, vol. 66(6), pages 5257-5298, December.
    16. Rami V. Tabri, 2026. "Distributional Change in Ordinal Data with Missing Observations: Minimal Mobility and Partial Identification," Papers 2604.12611, arXiv.org, revised Apr 2026.
    17. Buhai, Ioan-Sebastian, 2026. "The Geometry of Heterogeneous Extremes: Optimal Transport and Entropic Design," IZA Discussion Papers 18511, IZA Network @ LISER.
    18. Roger Koenker, 2017. "Quantile regression 40 years on," CeMMAP working papers 36/17, Institute for Fiscal Studies.
    19. Kuan‐Ming Chen & Yu‐Wei Hsieh & Ming‐Jen Lin, 2023. "Reducing Recommendation Inequality Via Two‐Sided Matching: A Field Experiment Of Online Dating," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 64(3), pages 1201-1221, August.
    20. Keita Sunada & Kohei Izumi, 2025. "Optimal treatment assignment rules under capacity constraints," Papers 2506.12225, arXiv.org, revised Sep 2025.
    21. Arthur Charpentier & Alfred Galichon & Lucas Vernet, 2019. "Optimal transport on large networks a practitioner guide," Sciences Po Economics Publications (main) hal-02173210, HAL.
