Printed from https://ideas.repec.org/p/arx/papers/2511.04658.html

Where to Experiment? Site Selection Under Distribution Shift via Optimal Transport and Wasserstein DRO

Author

Listed:
  • Adam Bouyamourn

Abstract

How should researchers select experimental sites when the deployment population differs from observed data? I formulate experimental site selection as an optimal transport problem, developing methods that minimize downstream estimation error by choosing sites that minimize the Wasserstein distance between the population and sample covariate distributions. I develop new theoretical upper bounds on the estimation errors of the population average treatment effect (PATE) and the conditional average treatment effect (CATE), and show that these two objectives lead to different site selection strategies. I extend this approach by using Wasserstein Distributionally Robust Optimization (DRO) to develop a site selection procedure that is robust to adversarial perturbations of covariate information, a specific model of distribution shift. I also propose a novel data-driven procedure for selecting the uncertainty radius of the Wasserstein DRO problem, which allows the user to benchmark robustness levels against observed variation in their data. Simulation evidence and a reanalysis of a randomized microcredit experiment in Morocco (Crépon et al.) show that these methods outperform random and stratified sampling of sites when covariates have a prognostic R-squared above 0.5, and outperform alternative optimization methods (i) for moderate-to-large problem instances, (ii) when covariates are moderately informative about treatment effects, and (iii) under induced distribution shift.
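The core idea in the abstract, choosing sites so that the pooled sample covariate distribution is close to the population distribution in Wasserstein distance, can be illustrated with a toy sketch. This is not the paper's algorithm. It assumes a single scalar covariate, uses the 1-Wasserstein distance between empirical distributions, and brute-forces every subset of k sites, which is only feasible for small instances. The function names `wasserstein_1d` and `select_sites` are hypothetical and introduced here for illustration.

```python
from itertools import combinations

import numpy as np


def wasserstein_1d(u, v):
    """1-Wasserstein distance between two 1-D empirical distributions.

    Computed as the integral of |F_u - F_v| over the real line, where
    F_u and F_v are the empirical CDFs of the two samples.
    """
    u = np.sort(np.asarray(u, dtype=float))
    v = np.sort(np.asarray(v, dtype=float))
    grid = np.sort(np.concatenate([u, v]))
    widths = np.diff(grid)
    # Empirical CDF values on each interval [grid[i], grid[i + 1]).
    u_cdf = np.searchsorted(u, grid[:-1], side="right") / u.size
    v_cdf = np.searchsorted(v, grid[:-1], side="right") / v.size
    return float(np.sum(np.abs(u_cdf - v_cdf) * widths))


def select_sites(site_covariates, population, k):
    """Return (indices, distance) for the k-site subset closest to the population.

    Illustrative assumption: exhaustive search over all subsets of size k,
    with each candidate subset's covariates pooled into one sample.
    """
    best_subset, best_distance = None, float("inf")
    for subset in combinations(range(len(site_covariates)), k):
        pooled = np.concatenate([site_covariates[i] for i in subset])
        distance = wasserstein_1d(pooled, population)
        if distance < best_distance:
            best_subset, best_distance = subset, distance
    return best_subset, best_distance


if __name__ == "__main__":
    # Three hypothetical sites; the population mixes low and high values.
    sites = [np.array([0.0, 0.0]), np.array([10.0, 10.0]), np.array([5.0, 5.0])]
    population = np.array([0.0, 10.0, 0.0, 10.0])
    print(select_sites(sites, population, k=2))
```

In this toy case the "average" site (all values equal to 5) is not selected: pooling the low and high sites reproduces the population distribution exactly, whereas any pair containing the middle site does not. Realistic problems have multivariate covariates and many candidate sites, where exhaustive search is infeasible and an optimization formulation is needed.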

Suggested Citation

  • Adam Bouyamourn, 2025. "Where to Experiment? Site Selection Under Distribution Shift via Optimal Transport and Wasserstein DRO," Papers 2511.04658, arXiv.org.
  • Handle: RePEc:arx:papers:2511.04658

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2511.04658
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Abhijit Banerjee & Rukmini Banerji & James Berry & Esther Duflo & Harini Kannan & Shobhini Mukerji & Marc Shotland & Michael Walton, 2017. "From Proof of Concept to Scalable Policies: Challenges and Solutions, with an Application," Journal of Economic Perspectives, American Economic Association, vol. 31(4), pages 73-102, Fall.
    2. Arthur Charpentier & Emmanuel Flachaire & Ewen Gallic, 2023. "Optimal Transport for Counterfactual Estimation: A Method for Causal Inference," Papers 2301.07755, arXiv.org.
    3. Daniel Zhuoyu Long & Jin Qi & Aiqi Zhang, 2024. "Supermodularity in Two-Stage Distributionally Robust Optimization," Management Science, INFORMS, vol. 70(3), pages 1394-1409, March.
    4. Alfred Galichon, 2016. "Optimal Transport Methods in Economics," Economics Books, Princeton University Press, edition 1, number 10870.
    5. Abhijit V. Banerjee & Shawn Cole & Esther Duflo & Leigh Linden, 2007. "Remedying Education: Evidence from Two Randomized Experiments in India," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 122(3), pages 1235-1264.
    6. Alberto Abadie & Jinglong Zhao, 2021. "Synthetic Controls for Experimental Design," Papers 2108.02196, arXiv.org, revised Apr 2025.
    7. Alfred Galichon, 2016. "Optimal transport methods in economics," Sciences Po Economics Publications (main) hal-03256830, HAL.
    8. Tessa Bold & Mwangi Kimenyi & Germano Mwabu & Alice Ng’ang’a & Justin Sandefur, 2018. "Experimental evidence on scaling up education reforms in Kenya," Journal of Public Economics, Elsevier, vol. 168(C), pages 1-20.
    9. Dimitris Bertsimas & Melvyn Sim, 2004. "The Price of Robustness," Operations Research, INFORMS, vol. 52(1), pages 35-53, February.
    10. Zhengyang Fan & Ran Ji & Miguel A. Lejeune, 2024. "Distributionally Robust Portfolio Optimization under Marginal and Copula Ambiguity," Journal of Optimization Theory and Applications, Springer, vol. 203(3), pages 2870-2907, December.
    11. Bruno Crépon & Florencia Devoto & Esther Duflo & William Parienté, 2015. "Estimating the Impact of Microcredit on Those Who Take It Up: Evidence from a Randomized Experiment in Morocco," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 123-150, January.
    12. Alfred Galichon, 2016. "Optimal transport methods in economics," SciencePo Working papers hal-03256830, HAL.
    13. Yueyao Li & Wenxun Xing, 2024. "Globalized distributionally robust optimization based on samples," Journal of Global Optimization, Springer, vol. 88(4), pages 871-900, April.
    14. Alberto Abadie & Alexis Diamond & Jens Hainmueller, 2010. "Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program," Journal of the American Statistical Association, American Statistical Association, vol. 105(490), pages 493-505.
    15. Hunt Allcott, 2015. "Site Selection Bias in Program Evaluation," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 130(3), pages 1117-1165.
    16. Tyler J. VanderWeele & Ilya Shpitser, 2011. "A New Criterion for Confounder Selection," Biometrics, The International Biometric Society, vol. 67(4), pages 1406-1413, December.
    17. Naoki Egami & Erin Hartman, 2023. "Elements of External Validity: Framework, Design, and Analysis," American Political Science Review, Cambridge University Press, vol. 117(3), pages 1070-1088, August.
    18. William Torous & Florian Gunsilius & Philippe Rigollet, 2024. "An optimal transport approach to estimating causal effects via nonlinear difference-in-differences," Journal of Causal Inference, De Gruyter, vol. 12(1), pages 1-26.
    19. Yuchen Hu & Henry Zhu & Emma Brunskill & Stefan Wager, 2024. "Minimax-Regret Sample Selection in Randomized Experiments," Papers 2403.01386, arXiv.org, revised Jun 2024.
    20. Alberto Abadie & Javier Gardeazabal, 2003. "The Economic Costs of Conflict: A Case Study of the Basque Country," American Economic Review, American Economic Association, vol. 93(1), pages 113-132, March.
    21. Jose Blanchet & Karthyek Murthy, 2019. "Quantifying Distributional Model Risk via Optimal Transport," Mathematics of Operations Research, INFORMS, vol. 44(2), pages 565-600, May.
    22. William Torous & Florian Gunsilius & Philippe Rigollet, 2021. "An Optimal Transport Approach to Estimating Causal Effects via Nonlinear Difference-in-Differences," Papers 2108.05858, arXiv.org, revised Mar 2024.
    23. Alfred Galichon, 2016. "Optimal transport methods in economics," Post-Print hal-03256830, HAL.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Andrei Voronin, 2025. "Generalized Optimal Transport," Papers 2507.22422, arXiv.org.
    2. Haiyan Liu & Bin Wang & Ruodu Wang & Sheng Chao Zhuang, 2023. "Distorted optimal transport," Papers 2308.11238, arXiv.org, revised May 2025.
    3. repec:osf:osfxxx:nwp8k_v1 is not listed on IDEAS
    4. Susanne Schennach & Vincent Starck, 2025. "Optimally-Transported Generalized Method of Moments," Papers 2511.05712, arXiv.org.
    5. Itai Arieli & Yakov Babichenko & Fedor Sandomirskiy, 2023. "Feasible Conditional Belief Distributions," Papers 2307.07672, arXiv.org, revised Nov 2024.
    6. Beatrice Acciaio & Berenice Anne Neumann, 2025. "Characterization of transport optimizers via graphs and applications to Stackelberg–Cournot–Nash equilibria," Mathematics and Financial Economics, Springer, vol. 19(3), December.
    7. Andrew Lyasoff, 2023. "The Time-Interlaced Self-Consistent Master System of Heterogeneous-Agent Models," Papers 2303.12567, arXiv.org, revised May 2025.
    8. Florian F Gunsilius, 2025. "A primer on optimal transport for causal inference with observational data," Papers 2503.07811, arXiv.org, revised Mar 2025.
    9. Roger Koenker, 2017. "Quantile regression 40 years on," CeMMAP working papers 36/17, Institute for Fiscal Studies.
    10. Kuan‐Ming Chen & Yu‐Wei Hsieh & Ming‐Jen Lin, 2023. "Reducing Recommendation Inequality Via Two‐Sided Matching: A Field Experiment Of Online Dating," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 64(3), pages 1201-1221, August.
    11. Keita Sunada & Kohei Izumi, 2025. "Optimal treatment assignment rules under capacity constraints," Papers 2506.12225, arXiv.org, revised Sep 2025.
    12. Arthur Charpentier & Alfred Galichon & Lucas Vernet, 2019. "Optimal transport on large networks a practitioner guide," Sciences Po Economics Publications (main) hal-02173210, HAL.
    13. Yagan Hazard & Toru Kitagawa, 2025. "Who With Whom? Learning Optimal Matching Policies," Papers 2507.13567, arXiv.org.
    14. Lise Masselus & Christina Petrik & Jörg Ankel-Peters, 2024. "Lost in the design space? Construct validity in the microfinance literature," Ruhr Economic Papers 1097, RWI - Leibniz-Institut für Wirtschaftsforschung, Ruhr-University Bochum, TU Dortmund University, University of Duisburg-Essen.
    15. Umut Cetin, 2025. "Insider trading with penalties in continuous time," LSE Research Online Documents on Economics 128957, London School of Economics and Political Science, LSE Library.
    16. Alfred Galichon & Bernard Salanié, 2023. "Structural Estimation of Matching Markets with Transferable Utility," Post-Print hal-03935865, HAL.
    17. Ashwin Kambhampati & Carlos Segura‐Rodriguez, 2022. "The optimal assortativity of teams inside the firm," RAND Journal of Economics, RAND Corporation, vol. 53(3), pages 484-515, September.
    18. Omar Abdul Halim & Brendan Pass, 2025. "Multi-to -one dimensional and semi-discrete screening," Papers 2506.21740, arXiv.org, revised Oct 2025.
    19. Wayne Yuan Gao & Rui Wang, 2023. "IV Regressions without Exclusion Restrictions," Papers 2304.00626, arXiv.org, revised Jul 2023.
    20. Florian Gunsilius & Susanne M. Schennach, 2017. "A nonlinear principal component decomposition," CeMMAP working papers CWP16/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    21. Jason T. Kerwin & Rebecca L. Thornton, 2021. "Making the Grade: The Sensitivity of Education Program Effectiveness to Input Choices and Outcome Measures," The Review of Economics and Statistics, MIT Press, vol. 103(2), pages 251-264, May.
