IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2604.06123.html

A Large-Scale Empirical Comparison of Meta-Learners and Causal Forests for Heterogeneous Treatment Effect Estimation in Marketing Uplift Modeling

Author

Listed:
  • Aman Singh

Abstract

Estimating Conditional Average Treatment Effects (CATE) at the individual level is central to precision marketing, yet systematic benchmarking of uplift modeling methods at industrial scale remains limited. We present UpliftBench, an empirical evaluation of four CATE estimators: S-Learner, T-Learner, X-Learner (all with LightGBM base learners), and Causal Forest (EconML), applied to the Criteo Uplift v2.1 dataset comprising 13.98 million customer records. The near-random treatment assignment (propensity AUC = 0.509) provides strong internal validity for causal estimation. Evaluated via Qini coefficient and cumulative gain curves, the S-Learner achieves the highest Qini score of 0.376, with the top 20% of customers ranked by predicted CATE capturing 77.7% of all incremental conversions, a 3.9x improvement over random targeting. SHAP analysis identifies f8 as the dominant heterogeneous treatment effect (HTE) driver among the 12 anonymized covariates. Causal Forest uncertainty quantification reveals that 1.9% of customers are confident persuadables (lower 95% CI > 0) and 0.1% are confident sleeping dogs (upper 95% CI

Suggested Citation

  • Aman Singh, 2026. "A Large-Scale Empirical Comparison of Meta-Learners and Causal Forests for Heterogeneous Treatment Effect Estimation in Marketing Uplift Modeling," Papers 2604.06123, arXiv.org.
  • Handle: RePEc:arx:papers:2604.06123
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2604.06123
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lechner, Michael, 2018. "Modified Causal Forests for Estimating Heterogeneous Causal Effects," IZA Discussion Papers 12040, IZA Network @ LISER.
    2. William Arbour, 2021. "Can Recidivism be Prevented from Behind Bars? Evidence from a Behavioral Program," Working Papers tecipa-683, University of Toronto, Department of Economics.
    3. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Dimitris Bertsimas & Agni Orfanoudaki & Rory B. Weiner, 2020. "Personalized treatment for coronary artery disease patients: a machine learning approach," Health Care Management Science, Springer, vol. 23(4), pages 482-506, December.
    5. Jiang, Mobing & Chen, Xinyu & Xiao, Mingyue & Zhang, Yuning & Wen, Wu & Chen, Xiaohua, 2025. "From connect to conquer: capital market liberalization and Chinese firms' cross-border mergers and acquisitions," Pacific-Basin Finance Journal, Elsevier, vol. 93(C).
    6. Justin Whitehouse & Qizhao Chen & Morgane Austern & Vasilis Syrgkanis, 2025. "Inference on Optimal Policy Values and Other Irregular Functionals via Softmax Smoothing," Papers 2507.11780, arXiv.org, revised Mar 2026.
    7. Nicolaj N. Mühlbach, 2020. "Tree-based Synthetic Control Methods: Consequences of moving the US Embassy," CREATES Research Papers 2020-04, Department of Economics and Business Economics, Aarhus University.
    8. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    9. Shonosuke Sugasawa & Hisashi Noma, 2021. "Efficient screening of predictive biomarkers for individual treatment selection," Biometrics, The International Biometric Society, vol. 77(1), pages 249-257, March.
    10. Ruoxuan Xiong & Allison Koenecke & Michael Powell & Zhu Shen & Joshua T. Vogelstein & Susan Athey, 2021. "Federated Causal Inference in Heterogeneous Observational Data," Papers 2107.11732, arXiv.org, revised Apr 2023.
    11. Arne Henningsen & Guy Low & David Wuepper & Tobias Dalhaus & Hugo Storm & Dagim Belay & Stefan Hirsch, 2024. "Estimating Causal Effects with Observational Data: Guidelines for Agricultural and Applied Economists," IFRO Working Paper 2024/03, University of Copenhagen, Department of Food and Resource Economics.
    12. Bingnan Guo & Yuren Qian & Xinyan Guo & Hao Zhang, 2025. "Impact of Zero-Waste City Pilot Policies on Urban Energy Consumption Intensity: Causal Inference Based on Double Machine Learning," Sustainability, MDPI, vol. 17(11), pages 1-25, May.
    13. Hayakawa, Kazunobu & Keola, Souknilanh & Silaphet, Korrakoun & Yamanouchi, Kenta, 2022. "Estimating the impacts of international bridges on foreign firm locations: a machine learning approach," IDE Discussion Papers 847, Institute of Developing Economies, Japan External Trade Organization(JETRO).
    14. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org, revised Aug 2022.
    15. Naguib, Costanza, 2019. "Estimating the Heterogeneous Impact of the Free Movement of Persons on Relative Wage Mobility," Economics Working Paper Series 1903, University of St. Gallen, School of Economics and Political Science.
    16. Labro, Eva & Lang, Mark & Omartian, James D., 2023. "Predictive analytics and centralization of authority," Journal of Accounting and Economics, Elsevier, vol. 75(1).
    17. Rina Friedberg & Julie Tibshirani & Susan Athey & Stefan Wager, 2018. "Local Linear Forests," Papers 1807.11408, arXiv.org, revised Sep 2020.
    18. Ke-Lin Du & Rengong Zhang & Bingchun Jiang & Jie Zeng & Jiabin Lu, 2025. "Foundations and Innovations in Data Fusion and Ensemble Learning for Effective Consensus," Mathematics, MDPI, vol. 13(4), pages 1-49, February.
    19. Liangyuan Hu & Lihua Li, 2022. "Using Tree-Based Machine Learning for Health Studies: Literature Review and Case Series," IJERPH, MDPI, vol. 19(23), pages 1-13, December.
    20. Wu, Libo & Zhou, Yang, 2025. "Social norms and energy conservation in China," Resource and Energy Economics, Elsevier, vol. 82(C).

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2604.06123. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.