
Calibration Strategies for Robust Causal Estimation: Theoretical and Empirical Insights on Propensity Score Based Estimators

Authors

  • Jan Rabenseifner
  • Sven Klaassen
  • Jannis Kueck
  • Philipp Bach

Abstract

The partitioning of data for estimation and calibration critically impacts the performance of propensity score based estimators like inverse probability weighting (IPW) and double/debiased machine learning (DML) frameworks. We extend recent advances in calibration techniques for propensity score estimation, improving the robustness of propensity scores in challenging settings such as limited overlap, small sample sizes, or unbalanced data. Our contributions are twofold: First, we provide a theoretical analysis of the properties of calibrated estimators in the context of DML. To this end, we refine existing calibration frameworks for propensity score models, with a particular emphasis on the role of sample-splitting schemes in ensuring valid causal inference. Second, through extensive simulations, we show that calibration reduces variance of inverse-based propensity score estimators while also mitigating bias in IPW, even in small-sample regimes. Notably, calibration improves stability for flexible learners (e.g., gradient boosting) while preserving the doubly robust properties of DML. A key insight is that, even when methods perform well without calibration, incorporating a calibration step does not degrade performance, provided that an appropriate sample-splitting approach is chosen.
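
The role of calibration and sample splitting described above can be illustrated with a small sketch. It is not the authors' implementation. It assumes isotonic regression as the calibrator (the abstract does not name a method), a single two-fold split in which the calibrator is fitted on one fold and applied on the other, and a Horvitz-Thompson IPW estimator with scores clipped to [0.01, 0.99]. The data-generating process, the deliberately overconfident `raw_score`, and all function names are hypothetical and chosen only for illustration.

```python
import bisect
import math
import random


def pava(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block holds [sum, count] of the values pooled into it
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def fit_calibrator(scores, treatments):
    """Isotonic regression of treatment indicators on raw propensity scores."""
    pairs = sorted(zip(scores, treatments))
    return [s for s, _ in pairs], pava([d for _, d in pairs])


def clip(p, eps=0.01):
    """Keep a propensity score inside [eps, 1 - eps]."""
    return min(max(p, eps), 1.0 - eps)


def calibrate(calibrator, score, eps=0.01):
    """Step-function lookup of the calibrated score, then clipping."""
    xs, fitted = calibrator
    i = min(max(bisect.bisect_right(xs, score) - 1, 0), len(xs) - 1)
    return clip(fitted[i], eps)


def ipw_ate(rows, propensity):
    """Horvitz-Thompson IPW estimate of the ATE; rows are (x, d, y) tuples."""
    total = 0.0
    for x, d, y in rows:
        m = propensity(x)
        total += d * y / m - (1 - d) * y / (1.0 - m)
    return total / len(rows)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, rng, tau=1.0):
    """Hypothetical design: true propensity sigmoid(x), constant effect tau."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        d = 1 if rng.random() < sigmoid(x) else 0
        rows.append((x, d, tau * d + x + rng.gauss(0.0, 1.0)))
    return rows


def raw_score(x):
    """Stand-in for an overconfident learner: too extreme, but correctly ranked."""
    return sigmoid(3.0 * x)


if __name__ == "__main__":
    rng = random.Random(0)
    rows = simulate(4000, rng)
    # Assumed splitting scheme: calibrate on one fold, estimate on the other.
    calib_fold, est_fold = rows[:2000], rows[2000:]
    calibrator = fit_calibrator(
        [raw_score(x) for x, _, _ in calib_fold],
        [d for _, d, _ in calib_fold],
    )
    print("IPW, raw scores:       ", ipw_ate(est_fold, lambda x: clip(raw_score(x))))
    print("IPW, calibrated scores:",
          ipw_ate(est_fold, lambda x: calibrate(calibrator, raw_score(x))))
```

Because the calibrator is fitted on a fold disjoint from the one used for estimation, the calibrated scores are not tuned to the treatment indicators they later reweight. This is only one of several possible splitting schemes, and the sketch does not reproduce the paper's comparison of them.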

Suggested Citation

  • Jan Rabenseifner & Sven Klaassen & Jannis Kueck & Philipp Bach, 2025. "Calibration Strategies for Robust Causal Estimation: Theoretical and Empirical Insights on Propensity Score Based Estimators," Papers 2503.17290, arXiv.org, revised Apr 2025.
  • Handle: RePEc:arx:papers:2503.17290

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2503.17290
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Daniele Ballinari & Nora Bearth, 2024. "Improving the Finite Sample Estimation of Average Treatment Effects using Double/Debiased Machine Learning with Propensity Score Calibration," Papers 2409.04874, arXiv.org, revised Jan 2025.
    2. Hainmueller, Jens, 2012. "Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies," Political Analysis, Cambridge University Press, vol. 20(1), pages 25-46, January.
    3. Ballinari, Daniele, 2024. "Calibrating doubly-robust estimators with unbalanced treatment assignment," Economics Letters, Elsevier, vol. 241(C).
    4. Matias Busso & John DiNardo & Justin McCrary, 2014. "New Evidence on the Finite Sample Properties of Propensity Score Reweighting and Matching Estimators," The Review of Economics and Statistics, MIT Press, vol. 96(5), pages 885-897, December.
    5. Tilmann Gneiting & Fadoua Balabdaoui & Adrian E. Raftery, 2007. "Probabilistic forecasts, calibration and sharpness," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 69(2), pages 243-268, April.
    6. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    7. Bryan S. Graham & Cristine Campos De Xavier Pinto & Daniel Egel, 2012. "Inverse Probability Tilting for Moment Condition Models with Missing Data," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 79(3), pages 1053-1079.
    8. Kosuke Imai & Marc Ratkovic, 2014. "Covariate balancing propensity score," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 243-263, January.
    9. José R. Zubizarreta, 2015. "Stable Weights that Balance Covariates for Estimation With Incomplete Outcome Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(511), pages 910-922, September.
    10. Xinkun Nie & Stefan Wager, 2017. "Quasi-Oracle Estimation of Heterogeneous Treatment Effects," Papers 1712.04912, arXiv.org, revised Aug 2020.
    11. Mario V. Wuthrich & Johanna Ziegel, 2023. "Isotonic Recalibration under a Low Signal-to-Noise Ratio," Papers 2301.02692, arXiv.org.
    12. Guido W. Imbens, 2004. "Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review," The Review of Economics and Statistics, MIT Press, vol. 86(1), pages 4-29, February.
    13. Victor Chernozhukov & Whitney K. Newey & Rahul Singh, 2022. "Automatic Debiased Machine Learning of Causal and Structural Effects," Econometrica, Econometric Society, vol. 90(3), pages 967-1027, May.
    14. Victor Chernozhukov & Whitney K. Newey & Victor Quintas-Martinez & Vasilis Syrgkanis, 2021. "Automatic Debiased Machine Learning via Riesz Regression," Papers 2104.14737, arXiv.org, revised Mar 2024.
    15. Philipp Bach & Victor Chernozhukov & Malte S. Kurz & Martin Spindler & Sven Klaassen, 2021. "DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R," Papers 2103.09603, arXiv.org, revised Jun 2024.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    2. Pedro H. C. Sant'Anna & Xiaojun Song & Qi Xu, 2022. "Covariate distribution balance via propensity scores," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(6), pages 1093-1120, September.
    3. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    4. Michael Lechner & Jana Mareckova, 2024. "Comprehensive Causal Machine Learning," Papers 2405.10198, arXiv.org, revised Feb 2025.
    5. Frölich, Markus & Huber, Martin & Wiesenfarth, Manuel, 2017. "The finite sample performance of semi- and non-parametric estimators for treatment effects and policy evaluation," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 91-102.
    6. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    7. Cousineau, Martin & Verter, Vedat & Murphy, Susan A. & Pineau, Joelle, 2023. "Estimating causal effects with optimization-based methods: A review and empirical comparison," European Journal of Operational Research, Elsevier, vol. 304(2), pages 367-380.
    8. Hugo Bodory & Lorenzo Camponovo & Martin Huber & Michael Lechner, 2020. "The Finite Sample Performance of Inference Methods for Propensity Score Matching and Weighting Estimators," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 38(1), pages 183-200, January.
    9. Phillip Heiler, 2022. "Efficient Covariate Balancing for the Local Average Treatment Effect," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(4), pages 1569-1582, October.
    10. Martin Cousineau & Vedat Verter & Susan A. Murphy & Joelle Pineau, 2022. "Estimating causal effects with optimization-based methods: A review and empirical comparison," Papers 2203.00097, arXiv.org.
    11. Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
    12. Hugo Bodory & Martin Huber & Michael Lechner, 2024. "The Finite Sample Performance of Instrumental Variable-Based Estimators of the Local Average Treatment Effect When Controlling for Covariates," Computational Economics, Springer;Society for Computational Economics, vol. 64(4), pages 2053-2078, October.
    13. Tymon Słoczyński & S. Derya Uysal & Jeffrey M. Wooldridge, 2025. "Abadie’s Kappa and Weighting Estimators of the Local Average Treatment Effect," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 43(1), pages 164-177, January.
    14. Dmitry Arkhangelsky & Susan Athey & David A. Hirshberg & Guido W. Imbens & Stefan Wager, 2021. "Synthetic Difference-in-Differences," American Economic Review, American Economic Association, vol. 111(12), pages 4088-4118, December.
    15. Dongcheng Zhang & Kunpeng Zhang, 2020. "Weighting-Based Treatment Effect Estimation via Distribution Learning," Papers 2012.13805, arXiv.org, revised May 2023.
    16. Xu, Wenfu & Tan, Zhiqiang, 2024. "High-dimensional model-assisted inference for treatment effects with multi-valued treatments," Journal of Econometrics, Elsevier, vol. 244(1).
    17. Jason J. Sauppe & Sheldon H. Jacobson, 2017. "The role of covariate balance in observational studies," Naval Research Logistics (NRL), John Wiley & Sons, vol. 64(4), pages 323-344, June.
    18. Sean Yiu & Li Su, 2022. "Joint calibrated estimation of inverse probability of treatment and censoring weights for marginal structural models," Biometrics, The International Biometric Society, vol. 78(1), pages 115-127, March.
    19. Zhang, Xiaoke & Xue, Wu & Wang, Qiyue, 2021. "Covariate balancing functional propensity score for functional treatments in cross-sectional observational studies," Computational Statistics & Data Analysis, Elsevier, vol. 163(C).
    20. Yimin Dai & Ying Yan, 2024. "Mahalanobis balancing: A multivariate perspective on approximate covariate balancing," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 51(4), pages 1450-1471, December.
