Printed from https://ideas.repec.org/p/arx/papers/2504.08324.html

An Introduction to Double/Debiased Machine Learning

Author

Listed:
  • Achim Ahrens
  • Victor Chernozhukov
  • Christian Hansen
  • Damian Kozbur
  • Mark Schaffer
  • Thomas Wiemann

Abstract

This paper provides an introduction to Double/Debiased Machine Learning (DML). DML is a general approach to performing inference about a target parameter in the presence of nuisance functions: objects that are needed to identify the target parameter but are not of primary interest. Nuisance functions arise naturally in many settings, such as when controlling for confounding variables or leveraging instruments. The paper describes two biases that arise from nuisance function estimation and explains how DML alleviates these biases. Consequently, DML allows the use of flexible methods, including machine learning tools, for estimating nuisance functions, reducing the dependence on auxiliary functional form assumptions and enabling the use of complex non-tabular data, such as text or images. We illustrate the application of DML through simulations and empirical examples. We conclude with a discussion of recommended practices. A companion website includes additional examples with code and references to other resources.
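
As an illustration of the approach the abstract describes, the following is a minimal sketch of cross-fitted DML in a partially linear model, Y = theta*D + g(X) + e with D = m(X) + v. The abstract does not specify an estimator, so the model, the simulated data, and the names fit_predict and dml_plr are assumptions made for this example only. A ridge regression on quadratic features stands in for the flexible machine learning method that would estimate the nuisance functions E[Y|X] and E[D|X] in practice. Residualizing both Y and D gives an orthogonal estimating equation, and predicting each fold from models fit on the other folds (cross-fitting) avoids reusing an observation to fit and evaluate its own nuisance prediction.

```python
import numpy as np


def fit_predict(X_train, y_train, X_test, ridge=1e-3):
    """Hypothetical nuisance learner: ridge regression on [1, X, X^2] features.

    Any flexible learner could take its place; this one keeps the sketch
    dependency-free.
    """
    def features(X):
        return np.column_stack([np.ones(len(X)), X, X ** 2])

    F_train, F_test = features(X_train), features(X_test)
    A = F_train.T @ F_train + ridge * np.eye(F_train.shape[1])
    beta = np.linalg.solve(A, F_train.T @ y_train)
    return F_test @ beta


def dml_plr(Y, D, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta and its standard error."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    folds = np.array_split(rng.permutation(n), n_folds)
    Y_res = np.empty(n)
    D_res = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Nuisance functions are fit on the other folds only (cross-fitting).
        Y_res[test_idx] = Y[test_idx] - fit_predict(
            X[train_mask], Y[train_mask], X[test_idx])
        D_res[test_idx] = D[test_idx] - fit_predict(
            X[train_mask], D[train_mask], X[test_idx])
    # Regress the outcome residual on the treatment residual.
    theta = (D_res @ Y_res) / (D_res @ D_res)
    # Heteroskedasticity-robust standard error for the residual regression.
    eps = Y_res - theta * D_res
    se = np.sqrt(np.sum(D_res ** 2 * eps ** 2)) / np.sum(D_res ** 2)
    return theta, se


# Simulated data (an assumption for illustration); the true theta is 0.5.
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 3))
D = 0.5 * X[:, 0] - X[:, 2] ** 2 + rng.normal(size=n)
Y = 0.5 * D + X[:, 0] ** 2 + X[:, 1] + rng.normal(size=n)

theta_hat, se_hat = dml_plr(Y, D, X)
print(f"theta_hat = {theta_hat:.3f}, se = {se_hat:.3f}")
```

In this simulated design the estimate should land close to 0.5. The paper and its companion website remain the reference for the estimators and practices the authors recommend.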

Suggested Citation

  • Achim Ahrens & Victor Chernozhukov & Christian Hansen & Damian Kozbur & Mark Schaffer & Thomas Wiemann, 2025. "An Introduction to Double/Debiased Machine Learning," Papers 2504.08324, arXiv.org, revised Feb 2026.
  • Handle: RePEc:arx:papers:2504.08324

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2504.08324
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Qi Li & Jeffrey Scott Racine, 2006. "Nonparametric Econometrics: Theory and Practice," Economics Books, Princeton University Press, edition 1, volume 1, number 8355, December.
    2. Joshua D. Angrist & Jörn-Steffen Pischke, 2009. "Mostly Harmless Econometrics: An Empiricist's Companion," Economics Books, Princeton University Press, edition 1, number 8769, December.
    3. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    4. Wright, Marvin N. & Ziegler, Andreas, 2017. "ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 77(i01).
    5. Qi Li & Jeffrey Scott Racine, 2006. "Density Estimation, from Nonparametric Econometrics: Theory and Practice," Introductory Chapters, in: Nonparametric Econometrics: Theory and Practice, Princeton University Press.
    6. Paul S. Clarke & Annalivia Polselli, 2023. "Double Machine Learning for Static Panel Models with Fixed Effects," Papers 2312.08174, arXiv.org, revised Dec 2024.
    7. Andrews, Donald W K, 1994. "Asymptotics for Semiparametric Econometric Models via Stochastic Equicontinuity," Econometrica, Econometric Society, vol. 62(1), pages 43-72, January.
    8. James J. Heckman & Vytlacil, Edward J., 2007. "Econometric Evaluation of Social Programs, Part I: Causal Models, Structural Models and Econometric Policy Evaluation," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 70, Elsevier.
    9. Kaspar Wüthrich & Ying Zhu, 2023. "Omitted Variable Bias of Lasso-Based Inference Methods: A Finite Sample Analysis," The Review of Economics and Statistics, MIT Press, vol. 105(4), pages 982-997, July.
    10. Wüthrich, Kaspar & Zhu, Ying, 2023. "Omitted Variable Bias of Lasso-Based Inference Methods: A Finite Sample Analysis," University of California at San Diego, Economics Working Paper Series qt1gp6g9gm, Department of Economics, UC San Diego.
    11. Sylvia Klosin, 2021. "Automatic Double Machine Learning for Continuous Treatment Effects," Papers 2104.10334, arXiv.org.
    12. Jinyong Hahn, 1998. "On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects," Econometrica, Econometric Society, vol. 66(2), pages 315-332, March.
    13. Hansen, Christian & Kozbur, Damian, 2014. "Instrumental variables estimation with many weak instruments using regularized JIVE," Journal of Econometrics, Elsevier, vol. 182(2), pages 290-308.
    14. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    15. Imbens, Guido W. & Rubin, Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, January-April.
    16. Chao, John C. & Swanson, Norman R. & Hausman, Jerry A. & Newey, Whitney K. & Woutersen, Tiemen, 2012. "Asymptotic Distribution of JIVE in a Heteroskedastic IV Regression with Many Instruments," Econometric Theory, Cambridge University Press, vol. 28(1), pages 42-86, February.
    17. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    18. van der Laan, Mark J. & Rubin, Daniel, 2006. "Targeted Maximum Likelihood Learning," The International Journal of Biostatistics, De Gruyter, vol. 2(1), pages 1-40, December.
    19. Sun, Liyang & Abraham, Sarah, 2021. "Estimating dynamic treatment effects in event studies with heterogeneous treatment effects," Journal of Econometrics, Elsevier, vol. 225(2), pages 175-199.
    20. Amilcar Velez, 2024. "On the Asymptotic Properties of Debiased Machine Learning Estimators," Papers 2411.01864, arXiv.org.
    21. Bekker, Paul A, 1994. "Alternative Approximations to the Distributions of Instrumental Variable Estimators," Econometrica, Econometric Society, vol. 62(3), pages 657-681, May.
    22. Vira Semenova & Victor Chernozhukov, 2021. "Debiased machine learning of conditional average treatment effects and other causal functions," The Econometrics Journal, Royal Economic Society, vol. 24(2), pages 264-289.
    23. Cun-Hui Zhang & Stephanie S. Zhang, 2014. "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 217-242, January.
    24. Baker, Andrew C. & Larcker, David F. & Wang, Charles C.Y., 2022. "How much should we trust staggered difference-in-differences estimates?," Journal of Financial Economics, Elsevier, vol. 144(2), pages 370-395.

    Citations

    Cited by:

    1. Francis J. DiTraglia & Laura Liu, 2025. "Bayesian Double Machine Learning for Causal Inference," Papers 2508.12688, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    2. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    3. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    4. Rahul Singh & Liyuan Xu & Arthur Gretton, 2020. "Kernel Methods for Causal Functions: Dose, Heterogeneous, and Incremental Response Curves," Papers 2010.04855, arXiv.org, revised Oct 2022.
    5. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    6. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    7. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Nov 2024.
    8. Lundberg, Ian & Brand, Jennie E. & Jeon, Nanum, 2022. "Researcher reasoning meets computational capacity: Machine learning for social science," SocArXiv s5zc8, Center for Open Science.
    9. Christian Hansen & Damian Kozbur & Sanjog Misra, 2016. "Targeted undersmoothing," ECON - Working Papers 282, Department of Economics - University of Zurich, revised Apr 2018.
    10. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    11. Aditya Ghosh & Dominik Rothenhausler, 2025. "Which Covariates to Adjust for? Specification-robust Causal Inference in Observational Studies," Papers 2505.08729, arXiv.org, revised Mar 2026.
    12. Lin, Zhexiao & Han, Fang, 2025. "On regression-adjusted imputation estimators of average treatment effects," Journal of Econometrics, Elsevier, vol. 251(C).
    13. Guido W. Imbens, 2015. "Matching Methods in Practice: Three Examples," Journal of Human Resources, University of Wisconsin Press, vol. 50(2), pages 373-419.
    14. Roth, Jonathan & Sant’Anna, Pedro H.C. & Bilinski, Alyssa & Poe, John, 2023. "What’s trending in difference-in-differences? A synthesis of the recent econometrics literature," Journal of Econometrics, Elsevier, vol. 235(2), pages 2218-2244.
    15. Xiduo Chen & Xingdong Feng & Antonio F. Galvao & Yeheng Ge, 2025. "Treatment Effects Inference with High-Dimensional Instruments and Control Variables," Papers 2503.20149, arXiv.org, revised Oct 2025.
    16. Mogstad, Magne & Torgovitsky, Alexander, 2024. "Instrumental variables with unobserved heterogeneity in treatment effects," Handbook of Labor Economics, Elsevier.
    17. Susan Athey & Julie Tibshirani & Stefan Wager, 2016. "Generalized Random Forests," Papers 1610.01271, arXiv.org, revised Apr 2018.
    18. Sokolov, Boris, 2025. "Causal Estimands for Policy Evaluation and Beyond," SocArXiv 4vtpk_v1, Center for Open Science.
    19. Jelena Bradic & Stefan Wager & Yinchu Zhu, 2019. "Sparsity Double Robust Inference of Average Treatment Effects," Papers 1905.00744, arXiv.org.
    20. Alena Skolkova, 2023. "Instrumental Variable Estimation with Many Instruments Using Elastic-Net IV," CERGE-EI Working Papers wp759, The Center for Economic Research and Graduate Education - Economics Institute, Prague.

    More about this item

    JEL classification:

    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models
    • C23 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Models with Panel Data; Spatio-temporal Models
    • C26 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Instrumental Variables (IV) Estimation

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.