
Estimating Visual Attribute Effects in Advertising from Observational Data: A Deepfake-Informed Double Machine Learning Approach

Authors

  • Yizhi Liu
  • Balaji Padmanabhan
  • Siva Viswanathan

Abstract

Digital advertising increasingly relies on visual content, yet marketers lack rigorous methods for understanding how specific visual attributes causally affect consumer engagement. This paper addresses a fundamental methodological challenge: estimating causal effects when the treatment, such as a model's skin tone, is an attribute embedded within the image itself. Standard approaches like Double Machine Learning (DML) fail in this setting because vision encoders entangle treatment information with confounding variables, producing severely biased estimates. We develop DICE-DML (Deepfake-Informed Control Encoder for Double Machine Learning), a framework that leverages generative AI to disentangle treatment from confounders. The approach combines three mechanisms: (1) deepfake-generated image pairs that isolate treatment variation; (2) DICE-Diff adversarial learning on paired difference vectors, where background signals cancel to reveal pure treatment fingerprints; and (3) orthogonal projection that geometrically removes treatment-axis components. In simulations with known ground truth, DICE-DML reduces root mean squared error by 73-97% compared to standard DML, with the strongest improvement (97.5%) at the null effect point, demonstrating robust Type I error control. Applying DICE-DML to 232,089 Instagram influencer posts, we estimate the causal effect of skin tone on engagement. Standard DML produces diagnostically invalid results (negative outcome R^2), while DICE-DML achieves valid confounding control (R^2 = 0.63) and estimates a marginally significant negative effect of darker skin tone (-522 likes; p = 0.062), substantially smaller than the biased standard estimate. Our framework provides a principled approach for causal inference with visual data when treatments and confounders coexist within images.
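The geometry behind mechanisms (1) to (3) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes synthetic embeddings in which treatment shifts every image along one fixed direction, and it replaces the DICE-Diff adversarial step with a plain mean of paired differences. The names `treatment_axis`, `project_out`, `true_axis`, and all sizes and noise levels are hypothetical choices made for illustration.

```python
import numpy as np


def treatment_axis(emb_treated, emb_control):
    """Estimate a unit treatment direction from paired embeddings.

    Each pair shares the same background, so subtracting the control
    embedding cancels the background and leaves the treatment shift.
    (Assumption: a simple mean stands in for adversarial learning.)
    """
    diffs = emb_treated - emb_control
    axis = diffs.mean(axis=0)
    return axis / np.linalg.norm(axis)


def project_out(emb, axis):
    """Remove the component of each embedding along the unit vector `axis`."""
    return emb - np.outer(emb @ axis, axis)


rng = np.random.default_rng(0)
n, d = 500, 16  # hypothetical number of image pairs and embedding size

# Synthetic ground truth: treatment moves embeddings along the first coordinate.
true_axis = np.zeros(d)
true_axis[0] = 1.0

background = rng.normal(size=(n, d))    # confounding image content
noise = 0.01 * rng.normal(size=(n, d))  # imperfect generated counterpart

emb_control = background
emb_treated = background + 2.0 * true_axis + noise

axis = treatment_axis(emb_treated, emb_control)
cleaned = project_out(emb_treated, axis)

print("alignment with true axis:", float(axis @ true_axis))
print("largest leftover treatment component:", float(np.abs(cleaned @ axis).max()))
```

In this toy setting the projected embeddings carry no component along the estimated treatment direction, so they could be passed as controls to a DML nuisance model without encoding the treatment itself. Real vision encoders are unlikely to represent a treatment as a single linear direction, which is presumably why the paper relies on adversarial learning instead of a simple mean.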

Suggested Citation

  • Yizhi Liu & Balaji Padmanabhan & Siva Viswanathan, 2026. "Estimating Visual Attribute Effects in Advertising from Observational Data: A Deepfake-Informed Double Machine Learning Approach," Papers 2603.02359, arXiv.org.
  • Handle: RePEc:arx:papers:2603.02359

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2603.02359
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jonathan Fuhr & Philipp Berens & Dominik Papies, 2024. "Estimating Causal Effects with Double Machine Learning -- A Method Evaluation," Papers 2403.14385, arXiv.org, revised Apr 2024.
    2. Wang, Jingyuan & Terabe, Shintaro & Yaginuma, Hideki, 2026. "Evaluating the long-term urban effects of high-speed rail in Japan: An integrated approach using synthetic difference-in-differences and double/debiased machine learning," Transportation Research Part A: Policy and Practice, Elsevier, vol. 203(C).
    3. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    4. Arne Henningsen & Guy Low & David Wuepper & Tobias Dalhaus & Hugo Storm & Dagim Belay & Stefan Hirsch, 2024. "Estimating Causal Effects with Observational Data: Guidelines for Agricultural and Applied Economists," IFRO Working Paper 2024/03, University of Copenhagen, Department of Food and Resource Economics.
    5. Vira Semenova, 2020. "Generalized Lee Bounds," Papers 2008.12720, arXiv.org, revised May 2025.
    6. Max Vilgalys, 2023. "A Machine Learning Approach to Measuring Climate Adaptation," Papers 2302.01236, arXiv.org.
    7. Victor Chernozhukov & Carlos Cinelli & Whitney Newey & Amit Sharma & Vasilis Syrgkanis, 2021. "Long Story Short: Omitted Variable Bias in Causal Machine Learning," Papers 2112.13398, arXiv.org, revised May 2024.
    8. Connor Lennon & Edward Rubin & Glen Waddell, 2025. "Machine learning the first stage in 2SLS: Practical guidance from bias decomposition and simulation," Papers 2505.13422, arXiv.org.
    9. Riccardo Di Francesco, 2024. "Aggregation Trees," Papers 2410.11408, arXiv.org, revised Oct 2025.
    10. Dennis Shen & Peng Ding & Jasjeet Sekhon & Bin Yu, 2022. "Same Root Different Leaves: Time Series and Cross-Sectional Methods in Panel Data," Papers 2207.14481, arXiv.org, revised Oct 2022.
    11. Yucheng Yang & Zhong Zheng & Weinan E, 2020. "Interpretable Neural Networks for Panel Data Analysis in Economics," Papers 2010.05311, arXiv.org, revised Nov 2020.
    12. Linsen Zhu & Yan Li & Lei Suo & Haiying Feng, 2025. "The Impact of High-Quality Development of Foreign Trade on Marine Economic Quality: Empirical Evidence from Coastal Provinces and Cities in China," Sustainability, MDPI, vol. 17(17), pages 1-29, August.
    13. Daniel Goller, 2023. "Analysing a built-in advantage in asymmetric darts contests using causal machine learning," Annals of Operations Research, Springer, vol. 325(1), pages 649-679, June.
    14. Ben Deaner & Chen-Wei Hsiang & Andrei Zeleneev, 2025. "Inferring Treatment Effects in Large Panels by Uncovering Latent Similarities," Papers 2503.20769, arXiv.org, revised Mar 2025.
    15. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    16. Sina Akbari & Negar Kiyavash & AmirEmad Ghassami, 2025. "Semiparametric Triple Difference Estimators," Papers 2502.19788, arXiv.org, revised Sep 2025.
    17. Augusto Cerqua & Marco Letta & Gabriele Pinto, 2024. "On the (Mis)Use of Machine Learning with Panel Data," Papers 2411.09218, arXiv.org, revised May 2025.
    18. Songul Cinaroglu, 2020. "Modelling unbalanced catastrophic health expenditure data by using machine‐learning methods," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 27(4), pages 168-181, October.
    19. Yihui He & Fang Han, 2023. "On propensity score matching with a diverging number of matches," Papers 2310.14142, arXiv.org, revised Nov 2023.
    20. Martin Huber & Sarina Joy Oberhansli, 2026. "Difference-in-differences for mediation analysis using double machine learning," Papers 2602.23877, arXiv.org.
